I. INTRODUCTION


A. INTRODUCTION 

In communications systems, limited bandwidth forces communications engineers and
managers to use the available bandwidth effectively. The advantages of digital
communications techniques, explained in Chapter 2, are persuading the communications
world to switch from analog to digital communications. However, digital techniques
require more bandwidth than analog techniques. Moreover, when the necessary bandwidth
is not available, digits awaiting transmission must be stored in buffers until their
turn for transmission occurs. Hence, the less bandwidth that is available to the
communications system, the larger the buffer size required.

One solution to this problem is to employ a variable length coding technique, 
known as Huffman coding. In Huffman coding, the code words are assigned to each 
source alphabet symbol with respect to its usage frequency in the language. The more 
frequent symbols are assigned shorter code words, and vice versa. 

In [Ref. 1], the Huffman coding process was modified by employing two
modification parameters, N and E. Modifying Huffman coding in this way produces a
small increase in average code length in exchange for a larger decrease in variance.
In [Ref. 2], an additional modification parameter, K, was introduced. In his research,
Akinsel concluded that the parameter E was the most robust. Both authors, after
modification, calculated the reduction in bandwidth and buffer size by comparing
Modified Huffman coding results with Block coding results.

In this research, source symbols are also encoded by using the Modified Huffman
coding technique. Modification is done only by employing parameter E, since it is the
most effective of the three. The main difference in this research is that the less
frequent source symbols are dropped before the messages are encoded. The anticipated
results are a reduction in average length in addition to a reduction in variance.
Further, a reduction in the required transmission bandwidth, as well as the buffer
size, is expected. The same idea is examined by dropping the more frequent source
symbols and by dropping a combination of more and less frequent symbols.


B. STRUCTURE 

The structure of the remainder of the thesis is as follows. 

Chapter 2 discusses the advantages and disadvantages of digital communications,
presents a brief background on Huffman coding and its modification, and introduces
the idea of dropping symbols.

In Chapter 3, the Turkish alphabet is encoded before and after dropping the less
frequent source symbols. The effects of the dropping process and the modification
parameter E on average length and variance are observed.

The effect of dropping the less frequent symbols on the meaning of the messages 
is examined in Chapter 4. 

Chapter 5 compares the bandwidth and buffer size requirements of Block coding,
Huffman coding, Modified Huffman coding, and the dropping process, by using a
simulation model of the communications system.

In Chapter 6, two other alternatives, dropping the more frequent symbols and
dropping a combination of more and less frequent symbols, are briefly explained.

A summary of results and the conclusions are provided in Chapter 7.
II. BACKGROUND


A. WHY DIGITAL? 

In his book, K. Feher [Ref. 3] states that, “at the present time, most major
operational terrestrial line-of-sight and satellite microwave systems use analog FM
modulation techniques. However, the trend in new development is such that the
overwhelming majority of new microwave systems will employ digital methods.”

This comment is true not just of microwave communications systems, but of all
types of communications systems. In the communications engineering field, most of the
new developments employ digital techniques, such as digital signal processing, digital
multiplexing, digital switching, transmission techniques, etc. [Ref. 3]. Hence,
communications requirements are mostly satisfied with these digital approaches. New
additions to existing communications networks tend to use digital techniques. This
trend, supported by new developments in the digital area, will increase, and by the
end of this century almost all new solutions to communications requirements will be
digital.

A major advantage of digital communications is the ability to operate at a low
signal-to-noise ratio [Ref. 4]. In the analog communications case, all kinds of
undesirable amplitude, frequency and phase variations, caused by either external noise
sources or system hardware, are received at the receiver.

However, in digital transmission, while the digital pulses are also affected by the
same sources, the receiver extracts the original information simply by looking at
“whether the received signal at the receiver at the time of sampling is either above or
below a particular voltage threshold” [Ref. 4].

The distance between stations should not be more than six thousand feet.
For longer-distance communications needs, repeaters are used very effectively.
After detecting the bit pattern of the signal, a repeater can reconstruct and retransmit
the signal without any error, either to the destination receiver or to another repeater.

In addition to this major advantage, digital communications networks have further
advantages and, like all other real-life systems, some disadvantages. These
advantages and disadvantages can be summarized as follows [Ref. 5].


1. Advantages Of The Digital Network 
a. Ease of multiplexing 

In digital communications, Time Division Multiplexing (TDM) is mostly used.
Although time division multiplexing of analog signals is possible, “the
vulnerability of narrow analog pulses to noise, distortion, crosstalk and intersymbol
interference” [Ref. 5] makes this option impractical. So, for the multiplexing of analog
signals, Frequency Division Multiplexing (FDM) is commonly used. TDM equipment is
less expensive than FDM equipment.

b. Ease of Signaling 

In digital systems, control information (on hook/off hook, address digits,
etc.) can be inserted into and extracted from a message stream independently of the
transmission medium. So, the transmission system can be designed separately from the
transmission medium. Taking this one step further, control functions and their formats
can be modified independently of the transmission subsystem. System upgrades
can be made without any impact on the control modules at either end of the link.

Analog transmission systems, by contrast, require special attention for control
signaling. Many of the different analog systems require unique control signals. The
control formats depend on the nature of both the transmission system and its terminal
equipment. Additionally, at the interfaces between different subsystems, these unique
control formats require the conversion of the control signals from one format to
another.

c. Use of Modern Technology 

Logic gates and memory, as they are used in digital computer technology,
can easily be used in digital signaling. The main idea in digital switching is simply to
use the “AND gate with one logic input assigned to the message signal and the other
inputs used for control” [Ref. 5: p.66]. Hence, the same technological development in
the computer logic circuits can be applied in digital integrated circuit technology.

Large scale integrated (LSI) circuits are developed specifically for
telecommunications. LSI chips improve the cost-effectiveness, size and reliability
of communications systems that use digital techniques. Despite the currently common
use of frequency division multiple access (FDMA) techniques, indications are that
future satellite communications will be digital.

In fiber optic communications, the interface between electronics and optical fibers
primarily uses the “on-off” mode of operation. Hence, the transmission link itself
emphasizes the digital mode of operation.

In the digital signal transmission area, which can be used for transmitting
both analog and digital waveforms, the digital technique is used.
d. Integration of Transmission and Switching 
In analog phone systems, the transmission and switching equipment are
considered separately, because they are functionally independent. However, in digital
systems, since the TDM of the signal is very similar to the time division switching
function, both can easily be integrated. The benefits of this integration are:
(1) demultiplexing equipment at switching offices is unnecessary
(2) greatly improved end-to-end voice quality
(3) cable entrance requirements and mainframe distribution of wire
pairs are reduced.
e. Signal Regeneration 
The analog waveform is transformed into a sequence of discrete values.
These discrete values are then represented by a sequence of binary digits. In
transmission, each binary digit is represented by one of two possible signal values,
such as “pulse and no pulse” or “positive pulse and negative pulse,” etc. At the
receiver, regeneration of the original signal is very simple, since all that is needed is to
distinguish between these two possible signal values. If the transmission distance is not
very long, the effect of external noise sources on the transmitted signal will not be
enough to push its value across the threshold. If the distance is longer than the
allowed distance, the undesired noise will be strong enough to destroy the signal.
In order to overcome this difficulty, repeaters are stationed between the source and the
destination. They detect the bit pattern, reconstruct it and transmit it to
the next repeater.
f. Ease of Encryption 
Encoding and decoding a digital bit stream is much easier than encoding and
decoding an analog signal. This characteristic makes digital techniques very
attractive, especially for military applications.
2. Disadvantages of Digital Network 
a. Analog - to - Digital Conversion 
One of the main expenses of the digital network is the conversion cost.
Since most digital networks are built on existing analog networks, the savings due to
reduced equipment, such as multiplexers and switches, generally cover the conversion
cost.
b. Need for Time Synchronization

Although some transmission systems in analog networks require some sort
of synchronization, such as carrier synchronization in FDM transmission systems,
synchronization is not a general requirement in analog networks. It can be considered
a function of the transmission system. On the other hand, in digital networks time
synchronization is a requirement: for optimum detection at the receiver, “the sample
clock must be synchronized to the pulse arrival time” [Ref. 5]. This problem grows
as the number of digital transmission links and switches in the network increases.

c. Increased Bandwidth 

In a digital system a waveform is sampled and these samples are coded into 
binary digits. For each digit, one individual pulse is transmitted. In the analog 
systems, the transmission of a waveform does not require more bandwidth than the 
underlying original wave. Hence, it can easily be seen that digital systems require more 
bandwidth than analog systems. 

Higher transmission quality is reached by representing the waveform with more
digits. This increases the bandwidth. The bandwidth increase, in voice digitization, is
directly dependent on the form of coding or modulation used.

In the existing local analog loop, since the bandwidth is underutilized, an
increase in bandwidth due to digitization might not create a big problem. In long-haul
systems, since the bandwidth utilization is high, an increase in bandwidth is less
acceptable.

One of the ways to overcome this major disadvantage of additional 
bandwidth requirements in digital communications is to reduce the bandwidth 
requirement by employing variable length encoding techniques. 

Contrary to the block coding technique, which uses a fixed bit sequence
length, Huffman coding assigns a bit sequence to each source symbol according to its
frequency of occurrence in the source alphabet. A higher frequency symbol is
assigned a shorter bit sequence, and vice versa [Ref. 6].

Two important requirements of this coding technique can be stated as 
follows: 

(1) Each character should be coded with a unique bit sequence.

(2) Decoding should be done in a way that the beginning and end of
each character is known without any special indicator [Ref. 7].
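
The second requirement is usually met by making the code prefix-free, so that no code word is the beginning of another. The following Python fragment is a minimal sketch of that check and of decoding without separators. The function names and the three-symbol table are illustrative assumptions and do not come from the thesis.

```python
def is_prefix_free(code_words):
    """Return True if no code word is a prefix of (or equal to) another."""
    words = sorted(code_words)
    # After sorting, a prefix relationship always shows up between neighbours.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, code_table):
    """Decode a bit string with a prefix-free table {symbol: code word}."""
    inverse = {word: symbol for symbol, word in code_table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # a complete code word has been read
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code word")
    return symbols


# Hypothetical three-symbol table, used only to exercise the functions.
table = {"A": "0", "B": "10", "C": "11"}
print(is_prefix_free(table.values()))
print(decode("01011", table))
```

Because no code word is a prefix of another, the decoder can emit a symbol as soon as the bits read so far match a table entry, which is what makes a special end-of-character indicator unnecessary.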
As explained in the next section, although Huffman coding reduces the
average length of code words, it introduces variance in the code length. A reduction
in average code length results in a gain (reduction) in bandwidth. On the other hand, a
high variance in the code length requires a larger buffer in the system.

One possible solution, presented in this thesis, is to encode messages by using
Modified Huffman coding (with modification parameters) after dropping the less
frequent source symbols. The expected result is a smaller average length and also a
smaller variance than either block coding or Huffman coding would provide.


B. HUFFMAN CODING 
Huffman coding is a minimum-redundancy code which uses the frequencies of
the symbols for assigning the binary digits in encoding. The more frequent (probable)
symbol will have the shorter length encoding [Ref. 7]. Let’s assume that there are N
symbols in the message. $P_i$ is the probability of the ith symbol, where i = 1,...,N. So,

$$\sum_{i=1}^{N} P_i = 1$$

$L_i$ is the length of the ith encoding. The average length of the code is [Ref. 6]:

$$L_{ave} = \sum_{i=1}^{N} P_i L_i$$

We can rewrite the symbols according to their probabilities in decreasing order:

$$P_1 \ge P_2 \ge P_3 \ge \dots \ge P_N$$

and, for an optimum code (minimum-redundancy code), lengths in increasing order:

$$L_1 \le L_2 \le L_3 \le \dots \le L_N$$


The Huffman binary coding procedure begins by arranging the symbols in
order of decreasing probability. Then, the two least probable symbols are combined
into one symbol. The new symbol’s probability is the sum of the two least probable
symbols’ probabilities. The new symbol is placed in decreasing probability order, and
again the two least probable symbols are combined. The process is repeated until
just two symbols remain. At that point we can assign the codes to the symbols. For
the sake of an example, let’s assign 0 to the upper symbol and 1 to the lower symbol.
Consider the following example [Ref. 2].

We have our source alphabet and symbol probabilities. First we write them in
decreasing order. See Figure 2.1.


Symbol    Probability

S1        0.4
S2        0.2
S3        0.2
S4        0.1
S5        0.05
S6        0.05

Figure 2.1 Source Alphabet And Symbol Probabilities.


Now we combine the two least probable symbols (S5, S6) and place the new
symbol into decreasing order (Figure 2.2). We keep combining the two least probable
symbols until we have just two symbols (Figure 2.3). The least probable symbols are
shown with #, and the combined one with *.


Figure 2.2. Reduction Process.


We assign (0) and (1) to the last two symbols. In our example, 0 is assigned to
the upper symbol and 1 is assigned to the lower symbol. Now we are ready to trace
backwards until we assign codes to each symbol in the alphabet. On the way back, each
combined symbol expands into two branches. We add one more digit for each branch.
Figure 2.4 shows the splitting process and Figure 2.5 shows the assigned code words.


P         P         P         P         P

0.4       0.4       0.4       0.4       *0.6
0.2       0.2       0.2       *0.4#     0.4
0.2       0.2       0.2#      0.2#
0.1       0.1#      *0.2#
0.05#     *0.1#
0.05#

Figure 2.3 Huffman Coding Reduction Process.


Symbol  P             P             P             P             P

S1      0.4(1)        0.4(1)        0.4(1)        0.4(1)        *0.6(0)
S2      0.2(01)       0.2(01)       0.2(01)       *0.4(00)#     0.4(1)
S3      0.2(000)      0.2(000)      0.2(000)#     0.2(01)#
S4      0.1(0010)     0.1(0010)#    *0.2(001)#
S5      0.05(00110)#  *0.1(0011)#
S6      0.05(00111)#

Figure 2.4 Splitting Process.
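
The reduction and splitting steps of this section can be sketched in Python as follows. This is an illustrative sketch and not the LISP program used in the thesis. The function name is hypothetical, exact fractions are an implementation choice made here to avoid floating-point ties, and the combined symbol is placed as low as possible among equal probabilities, with 0 given to the upper entry and 1 to the lower entry at each split.

```python
from fractions import Fraction


def huffman_codes(probabilities):
    """Build Huffman code words by repeated reduction of a sorted list.

    probabilities: dict {symbol: probability}, listed in the desired order
    for symbols of equal probability.
    """
    # Each entry is (probability, [symbols merged into this entry]).
    entries = sorted(
        ((Fraction(str(p)), [s]) for s, p in probabilities.items()),
        key=lambda entry: entry[0],
        reverse=True,  # Python's sort stays stable, so ties keep input order
    )
    codes = {s: "" for s in probabilities}
    while len(entries) > 1:
        lower_p, lower_syms = entries.pop()   # least probable entry
        upper_p, upper_syms = entries.pop()   # second least probable entry
        # Splitting step: a digit assigned at a later merge sits earlier
        # in the final code word, so it is prepended.
        for s in upper_syms:
            codes[s] = "0" + codes[s]
        for s in lower_syms:
            codes[s] = "1" + codes[s]
        combined_p = upper_p + lower_p
        # Place the combined symbol below every entry that is >= it.
        position = sum(1 for p, _ in entries if p >= combined_p)
        entries.insert(position, (combined_p, upper_syms + lower_syms))
    return codes


example = {"S1": 0.4, "S2": 0.2, "S3": 0.2, "S4": 0.1, "S5": 0.05, "S6": 0.05}
codes = huffman_codes(example)
for symbol, word in codes.items():
    print(symbol, word)
```

With the example alphabet, this placement rule gives the same code words as the final column of Figure 2.4.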


C. MODIFICATION OF HUFFMAN CODING 

In the example given in section B, the combined symbols are placed as low as
possible in the decreasing probability order. Using the given symbol probabilities
and the final code length of each symbol (Figure 2.5), we can calculate the average
code length as follows.


Average code length:

$$L = \sum_{i=1}^{N} P_i L_i$$

so

L = (0.4)1 + (0.2)2 + (0.2)3 + (0.1)4 + (0.05)5 + (0.05)5
L = 2.3
and the variance is

V = 0.4(1 - 2.3)^2 + 0.2(2 - 2.3)^2 + 0.2(3 - 2.3)^2 +
    0.1(4 - 2.3)^2 + 0.05(5 - 2.3)^2 + 0.05(5 - 2.3)^2
V = 1.81


Symbol    Code Word

S1        1
S2        01
S3        000
S4        0010
S5        00110
S6        00111

Figure 2.5 Final Code Words.

On the other hand, if we place the combined symbol as high as possible in the
decreasing probability order (see Figure 2.6), we will have different lengths for the
source symbols (2, 2, 2, 3, 4, 4). The average length and the variance are:

L = 0.4(2) + 0.2(2) + 0.2(2) + 0.1(3) + 0.05(4) + 0.05(4)
L = 2.3
V = 0.4(2 - 2.3)^2 + 0.2(2 - 2.3)^2 + 0.2(2 - 2.3)^2
    + 0.1(3 - 2.3)^2 + 0.05(4 - 2.3)^2 + 0.05(4 - 2.3)^2
V = 0.41

Although the average lengths are the same, placing the combined symbols as
high as possible gives us a smaller variance. This is a desirable result for
communications systems. We want codes with a shorter average length than block
coding provides, together with a small variance.
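
The two calculations above can be checked with a short Python sketch. The helper names are illustrative. The block code length is an assumption made here for comparison, taken as the smallest fixed number of binary digits that can distinguish the six symbols.

```python
import math


def average_length(probs, lengths):
    """Average code length: sum of P_i * L_i."""
    return sum(p * n for p, n in zip(probs, lengths))


def length_variance(probs, lengths):
    """Variance of the code length about the average length."""
    mean = average_length(probs, lengths)
    return sum(p * (n - mean) ** 2 for p, n in zip(probs, lengths))


probs = [0.4, 0.2, 0.2, 0.1, 0.05, 0.05]
low_placement = [1, 2, 3, 4, 5, 5]    # combined symbol as low as possible
high_placement = [2, 2, 2, 3, 4, 4]   # combined symbol as high as possible

# Assumed fixed-length block code for comparison (not taken from the thesis).
block_bits = math.ceil(math.log2(len(probs)))

for name, lengths in (("low", low_placement), ("high", high_placement)):
    print(name,
          round(average_length(probs, lengths), 2),
          round(length_variance(probs, lengths), 2))
print("block code length:", block_bits)
```

Both placements keep the average length at 2.3, below the fixed block length, while the variance depends on where the combined symbols are placed.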

As explained in [Ref. 1] and [Ref. 2], by employing the three parameters K, N, and
E, we can achieve lower variance Huffman codes, at the cost of a higher average length.
The initial decrease in variance is much larger than the increase in average length, so
the system obtains a net benefit.

Symbol  P             P             P             P             P

S1      0.4(00)       0.4(00)       0.4(00)       *0.4(1)       *0.6(0)
S2      0.2(10)       0.2(10)       0.2(01)       0.4(00)#      0.4(1)
S3      0.2(11)       0.2(11)       *0.2(10)#     0.2(01)#
S4      0.1(010)      *0.1(010)#    0.2(11)#
S5      0.05(0110)#   0.1(011)#
S6      0.05(0111)#

Figure 2.6 Huffman Coding Combined Symbols Are In Higher Position.


1. Parameters

a. Parameter N

N is an integer value. Instead of placing the combined symbol in decreasing
order with respect to its probability, it is placed in a higher position according to the
value of N. If N is 3, then the combined symbol is moved 3 positions higher than it
would otherwise be. The effect of N for the same example as given in section B can
be seen in Figure 2.7.





Figure 2.7. First Reduction For N = 3.


Huffman coding originally places the combined symbol just below the (0.1)
value, since its probability is (0.1). But when N is 3, it is placed between (0.4) and
(0.2). If we set N = 0, we obtain the original Huffman coding.


b. Parameter K

K is an integer value. The probability of the combined symbol is multiplied by
K, and the result is used as the new probability. The combined symbol is placed in
decreasing probability order with respect to this new probability. Obviously, if we set
K = 1, we have the original Huffman coding. The first reduction of the modification,
when K = 4, can be seen in Figure 2.8.


Figure 2.8 First Reduction For K = 4.


c. Parameter E

E is a real number which is added to the probability of the combined symbol.
The combined symbol is then placed in decreasing probability order with respect to
this new probability. The first reduction of the modification process for E = 0.2 is
given in Figure 2.9. The final code words, the average code length and the variance
are given in Figure 2.10. If we set E = 0.0, the original Huffman coding is obtained.


Figure 2.9 First Reduction For E = 0.2.
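
The first reduction under each parameter can be sketched in Python as shown below. The function name and its signature are illustrative assumptions. Exact fractions are used so that equal probabilities compare as equal, and the unmodified position is taken to be as low as possible in the decreasing order.

```python
from fractions import Fraction


def first_reduction(probabilities, n=0, k=1, e=0):
    """Perform one reduction step with modification parameters N, K and E.

    Returns the new probability list and the index of the combined symbol.
    """
    probs = sorted((Fraction(str(p)) for p in probabilities), reverse=True)
    lower, upper = probs.pop(), probs.pop()       # two least probable symbols
    combined = k * (upper + lower) + Fraction(str(e))
    # Unmodified position: below every entry that is >= the combined value.
    position = sum(1 for p in probs if p >= combined)
    position = max(0, position - n)               # N moves it N places higher
    probs.insert(position, combined)
    return probs, position


source = [0.4, 0.2, 0.2, 0.1, 0.05, 0.05]
for label, kwargs in (("original", {}), ("N = 3", {"n": 3}),
                      ("K = 4", {"k": 4}), ("E = 0.2", {"e": 0.2})):
    reduced, index = first_reduction(source, **kwargs)
    print(label, [float(p) for p in reduced], "combined at index", index,
          "sum =", float(sum(reduced)))
```

For K = 4 and E = 0.2 the probabilities after the first reduction sum to 1.3 and 1.2 respectively, which are the sums quoted for Figure 2.8 and Figure 2.9 in Section D.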


After using parameters N, K, and E at every step, the sum of the
probabilities is no longer equal to 1.0. This does not affect the coding process, as
will be shown in Section D.

2. The Most Effective Parameter

In [Ref. 2] the author used the parameters N, K, and E one at a time. As a
result, the parameter E was found to be the most robust parameter because it provided
better codes than parameters N and K. In my research I will use parameter E, but
neither N nor K.


Symbol  P             P             P             P             P

S1      0.4(01)       0.4(01)       *0.5(00)      *0.7(1)       *1.1(0)
S2      0.2(11)       *0.3(10)      0.4(01)       0.5(00)#      0.7(1)
S3      0.2(000)      0.2(11)       0.3(10)#      0.4(01)#
S4      0.1(001)      0.2(000)#     0.2(11)#
S5      0.05(100)#    0.1(001)#
S6      0.05(101)#

Figure 2.10 Modified Huffman Coding For E = 0.2.
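
A full modified coding pass with parameter E can be sketched by adding E to every combined probability before it is placed back into the list. The Python below is an illustrative sketch under the same assumptions as the reduction procedure of section B (0 for the upper entry, 1 for the lower entry, ties placed as low as possible). It is not the LISP program referred to in Chapter 3, and the function name is hypothetical.

```python
from fractions import Fraction


def modified_huffman_codes(probabilities, e=0):
    """Huffman reduction in which E is added to each combined probability."""
    extra = Fraction(str(e))
    entries = sorted(
        ((Fraction(str(p)), [s]) for s, p in probabilities.items()),
        key=lambda entry: entry[0],
        reverse=True,
    )
    codes = {s: "" for s in probabilities}
    while len(entries) > 1:
        lower_p, lower_syms = entries.pop()
        upper_p, upper_syms = entries.pop()
        for s in upper_syms:
            codes[s] = "0" + codes[s]
        for s in lower_syms:
            codes[s] = "1" + codes[s]
        combined_p = upper_p + lower_p + extra    # the only change from Huffman
        position = sum(1 for p, _ in entries if p >= combined_p)
        entries.insert(position, (combined_p, upper_syms + lower_syms))
    return codes


source = {"S1": 0.4, "S2": 0.2, "S3": 0.2, "S4": 0.1, "S5": 0.05, "S6": 0.05}
modified = modified_huffman_codes(source, e=0.2)

# Average length and variance use the true (unmodified) probabilities.
average = sum(p * len(modified[s]) for s, p in source.items())
variance = sum(p * (len(modified[s]) - average) ** 2 for s, p in source.items())
print(modified)
print(round(average, 2), round(variance, 2))
```

Setting e=0 reduces this sketch to the ordinary procedure of section B, which mirrors the remark that E = 0.0 gives the original Huffman coding.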


3. Dropping The Less Frequent Symbols 
The main idea of this research is to code the source alphabet in such a way
that the result has a short average length and a small variance. The general
variance formula is

$$V = \sum_{i=1}^{N} P_i (X_i - L)^2$$

where $X_i$ is the code length of the ith symbol and L is the average code length.

The code lengths which are further away from the average length cause
a large variance, since the second term of the above formula is squared. So, as a simple
idea, if we bring them close to the mean, we have a small variance. In Huffman and
Modified Huffman coding, the large variance is caused by the less frequent
symbols, which individually have long code lengths. If we leave them out of the
coding process, and hence out of the variance calculation, we can achieve a
smaller variance.

In summary, the strategy is to determine the symbol frequencies, drop the less
frequent symbols and then code using modified Huffman coding.
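
The dropping step of this strategy amounts to a simple filter on the symbol probabilities. The Python sketch below is illustrative only. The function name is hypothetical, and the threshold value passed in the example is an assumption chosen for the six-symbol alphabet of section B.

```python
def drop_less_frequent(probabilities, threshold):
    """Split an alphabet into kept and dropped symbols.

    A symbol is dropped when its probability is below the threshold.
    """
    kept = {s: p for s, p in probabilities.items() if p >= threshold}
    dropped = [s for s in probabilities if s not in kept]
    return kept, dropped


source = {"S1": 0.4, "S2": 0.2, "S3": 0.2, "S4": 0.1, "S5": 0.05, "S6": 0.05}
kept, dropped = drop_less_frequent(source, threshold=0.1)
print("kept:", kept)
print("dropped:", dropped)
print("remaining probability:", round(sum(kept.values()), 2))
```

The kept symbols are then passed to the coding procedure. Their probabilities no longer sum to 1.0, a point taken up in Section D.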

This idea is explained in the following calculations for the same source
alphabet given in section B. For the sake of example, let’s assume that symbols which
have a probability less than 0.1 will be considered the less frequent symbols. The
probability (P = 0.1) will be called the threshold, or limit, probability. In our example
we have two symbols, S5 and S6, which have probabilities less than 0.1. Their
probabilities are 0.05 and 0.05, respectively. In the first step we drop these two
symbols. See Figure 2.11.


Figure 2.11 Dropping Process For P = 0.1.


The second step is coding the rest of the symbols. To compare the different
techniques, the source alphabet will be coded first by using Huffman coding and then
by using modified Huffman coding for E = 0.2. Figure 2.12 shows the Huffman coding
after the dropping process and Figure 2.13 shows the modified Huffman coding
(E = 0.2) after the dropping process.


Symbol  P           P            P

S1      0.4(1)      0.4(1)       *0.5(0)
S2      0.2(01)     *0.3(00)#    0.4(1)
S3      0.2(000)#   0.2(01)#
S4      0.1(001)#

Figure 2.12 Huffman Coding After Dropping For P = 0.1.


The average code length and variance calculation results of each coding
technique are given in Figure 2.14.


Symbol  P           P            P

S1      0.4(00)     *0.5(1)      *0.8(0)
S2      0.2(01)     0.4(00)#     0.5(1)
S3      0.2(10)#    0.2(01)#
S4      0.1(11)#

Figure 2.13 Modified Huffman Coding For P = 0.1 And E = 0.2.


          Huffman    Modified H.    Huffman        Modified H.
                                    After Drop.    After Drop.

Aver.     2.3        2.4            1.7            1.8
Var.      1.81       0.24           0.721          0.036

Figure 2.14 Results Of The Four Different Coding Techniques.


As shown in Figure 2.14, dropping the less frequent symbols not only gave us
a smaller variance, but also gave us a shorter code length. For Huffman coding, the
reduction in average length after the dropping process is 26% and the reduction in
variance is 60.16%. For modified Huffman coding (E = 0.2), the reduction in average
length is 25% and the reduction in variance is 85%. If we compare Huffman coding
with modified Huffman coding after the dropping process, the reduction in average
length is 21.7% and the reduction in variance is 98%.
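
The percentages quoted above can be recomputed from the code lengths in Figures 2.4, 2.10, 2.12 and 2.13. The Python sketch below is illustrative. The helper names are hypothetical, and, as in the figures, the probabilities of the kept symbols are used without normalization.

```python
def average_and_variance(probs, lengths):
    """Return (average code length, variance of the code length)."""
    mean = sum(p * n for p, n in zip(probs, lengths))
    variance = sum(p * (n - mean) ** 2 for p, n in zip(probs, lengths))
    return mean, variance


def percent_reduction(before, after):
    """Reduction from 'before' to 'after', as a percentage of 'before'."""
    return 100.0 * (before - after) / before


full = [0.4, 0.2, 0.2, 0.1, 0.05, 0.05]   # S1..S6
kept = [0.4, 0.2, 0.2, 0.1]               # S1..S4 after dropping S5 and S6

huffman = average_and_variance(full, [1, 2, 3, 4, 5, 5])
modified = average_and_variance(full, [2, 2, 3, 3, 3, 3])
huffman_drop = average_and_variance(kept, [1, 2, 3, 3])
modified_drop = average_and_variance(kept, [2, 2, 2, 2])

comparisons = {
    "Huffman, before vs. after drop": (huffman, huffman_drop),
    "Modified, before vs. after drop": (modified, modified_drop),
    "Huffman vs. modified after drop": (huffman, modified_drop),
}
for label, (before, after) in comparisons.items():
    print(label,
          "average: %.2f%%" % percent_reduction(before[0], after[0]),
          "variance: %.2f%%" % percent_reduction(before[1], after[1]))
```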

At this point one important question arises: “What is the effect of dropping
the less frequent symbols on the meaning of the original message?” To maintain the
meaning of the messages, we need to choose a threshold probability P, for a given
source alphabet, in such a way that dropping the less frequent symbols does not affect
the message meaning.

In Chapters 3 and 4, an optimal threshold probability (P) will be derived for a
given source alphabet. In addition to maintaining message meaning, this optimal
threshold probability will give us a smaller average length and variance.
D. NORMALIZATION

Statistically, the sum of the symbol probabilities in the given source alphabet
must be 1.0. During the modification of Huffman coding with K and E, at every step in
the reduction process, the sum of the probabilities is no longer 1.0. In Figure 2.8 the
sum of the symbol probabilities is 1.3 and in Figure 2.9 it is 1.2.

In [Ref. 2: p.18], it is shown that during the modification of Huffman coding
using E and K, the same code words are obtained with or without normalization.
Similarly, after dropping the less frequent symbols, the sum of the probabilities is not
equal to 1.0. In Figure 2.11, it is 0.9. Figure 2.15 shows the Huffman coding after the
dropping process with normalization. For normalization, each symbol probability is
divided by 0.9. The code words obtained in Figure 2.12 and Figure 2.15 are exactly
the same. Since the same code words are reached, normalization will not be applied.


Symbol  P      Normalized P   P             P

S1      0.4    0.44(1)        0.44(1)       *0.56(0)
S2      0.2    0.22(01)       *0.34(00)#    0.44(1)
S3      0.2    0.22(000)#     0.22(01)#
S4      0.1    0.12(001)#

Figure 2.15 Huffman Coding After Dropping With Normalization.
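
For the unmodified Huffman procedure after dropping, normalization cannot change the result: dividing every probability by the same positive constant preserves every comparison made during the reduction. The Python sketch below illustrates this for the four kept symbols. It is an illustrative check, not the argument of [Ref. 2]. The function name is hypothetical, and exact division by 0.9 is used instead of the two-decimal values shown in Figure 2.15.

```python
from fractions import Fraction


def huffman_codes(probabilities):
    """Huffman reduction on a dict {symbol: Fraction}; upper = 0, lower = 1."""
    entries = sorted(
        ((p, [s]) for s, p in probabilities.items()),
        key=lambda entry: entry[0],
        reverse=True,
    )
    codes = {s: "" for s in probabilities}
    while len(entries) > 1:
        lower_p, lower_syms = entries.pop()
        upper_p, upper_syms = entries.pop()
        for s in upper_syms:
            codes[s] = "0" + codes[s]
        for s in lower_syms:
            codes[s] = "1" + codes[s]
        combined_p = upper_p + lower_p
        # Combined symbol goes below every entry that is >= it.
        position = sum(1 for p, _ in entries if p >= combined_p)
        entries.insert(position, (combined_p, upper_syms + lower_syms))
    return codes


raw = {"S1": Fraction(4, 10), "S2": Fraction(2, 10),
       "S3": Fraction(2, 10), "S4": Fraction(1, 10)}
total = sum(raw.values())                          # 9/10 after dropping
normalized = {s: p / total for s, p in raw.items()}

print(huffman_codes(raw))
print(huffman_codes(normalized))
```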
III. MODIFICATION OF HUFFMAN CODING FOR TURKISH ALPHABET

A. SYMBOL FREQUENCIES IN TURKISH ALPHABET

As mentioned in Chapter 2, Huffman coding is a minimum-redundancy code,
which uses the frequencies of the symbols in the source alphabet. The frequencies of
the symbols in the Turkish alphabet were calculated in [Ref. 1] and [Ref. 2] by using
the article given in Appendix A. The frequencies and the symbol probabilities of the
Turkish alphabet are given in Table 1 and Table 2, respectively.

These frequencies approximate the real Turkish alphabet frequencies. The main
difference is that some letters which are particular to the Turkish alphabet are not on
the keyboard. During the experiments these letters are represented by other Turkish
letters in such a way that the entire script can be understood by a Turkish reader with
its original meaning. These particular letters and their representations are given in
Figure 3.1.


Figure 3.1 Particular Turkish Letters And The Representatives.
TABLE 1
SYMBOL CHARACTERISTICS OF THE TURKISH ALPHABET


Symbol        Frequency  Cum. Freq.  Percent  Cum. Percent
‘                   182         182    1.017         1.017
(                    12         194    0.067         1.084
)                    15         209    0.084         1.168
;                    11         220    0.061         1.229
-                     3         223    0.017         1.246
space              2387        2610   13.339        14.585
:                   219        2829    1.224        15.809
?                     1        2830    0.006        15.814
[illegible]           6        2836    0.034        15.848
[illegible]          29        2865    0.162        16.010
[illegible]          20        2885    0.112        16.122
A                  1687        4572    9.427        25.549
B                   337        4909    1.883        27.432
C                   293        5202    1.637        29.070
D                   628        5830    3.509        32.579
E                  1423        7253    7.952        40.531
F                    64        7317    0.358        40.889
G                   391        7708    2.185        43.073
H                   104        7812    0.581        43.655
I                  1884        9696   10.528        54.183
J                     8        9704    0.045        54.227
K                   691       10395    3.861        58.089
L                   918       11313    5.130        63.219
M                   527       11840    2.945        66.164
N                  1183       13023    6.611        72.775
O                   476       13499    2.660        75.434
P                   123       13622    0.687        76.122
R                  1089       14711    6.085        82.207
S                   713       15424    3.984        86.192
T                   575       15999    3.213        89.405
U                   924       16923    5.163        94.568
V                   156       17079    0.872        95.440
W                     7       17086    0.039        95.479
X                     1       17087    0.006        95.485
Y                   480       17567    2.682        98.167
Z                   177       17744    0.989        99.156
0                    35       17779    0.196        99.352
1                    24       17803    0.134        99.486
2                    16       17819    0.089        99.575
3                    13       17832    0.073        99.648
4                    12       17844    0.067        99.715
5                    15       17859    0.084        99.799
6                     8       17867    0.045        99.844
7                     5       17872    0.028        99.871
8                    13       17885    0.073        99.944
9                    10       17895    0.056       100.000
TABLE 2
SYMBOL PROBABILITIES


Symbol Probability Symbol Probability


.00358
.00196
.00162
.00134
.00112
.00089
.00084
.00084
.00073
.00073
.00067
.00067
.00061
.00056
.00045
.00045
.00039
.00034
.00028
.00017
.00006
.00006
.00000
B. [illegible] OF THE CODES BY USING MODIFIED HUFFMAN


Code representations are assigned to the Turkish alphabet by the same procedure
explained in Chapter 2. Because of the size of the source alphabet, a computer
program written in LISP (Appendix B) is used [Ref. 1]. This program was run 50 times
with different values of the modification parameter E, from 0.0005 to 10.00. The
outputs of the program are the code words for each source symbol, the average code
length and the variance. For E ≥ 0.3, all the results show a constant average length
(5.08843) and a constant variance (0.08061). Table 3 gives the average lengths and
variances for each E value.

Generally, a small increase in average length can give us a large reduction in
variance. In other words, a smaller variance is traded off against a larger average
length. The smallest variance (0.08061) corresponds to the largest average length
(5.08843) and vice versa. Some results demonstrated different behavior. For example,
at E = 0.0005 the average length and the variance both increased (average length =
4.30858, variance = 1.92289), but the general tendency is clear. The trade-offs between
average length and variance are given in Figure 3.2. Table 4 gives the E values,
average lengths and variances in increasing order of average length. Table 5 gives
the same values in increasing order of variance.

During the experiments, one point is observed. The value of E is very effective up
to a limiting value, but for values greater than this limiting value, the effect of E
decreases. Figure 3.3 compares E and average length. If this figure is examined, it can
be seen that the average length increases with E up to E = 0.30 and then becomes
constant. The same observation can be made in Figure 3.4. In this figure, as E
increases, the variance decreases. For E ≥ 0.30, the variance also becomes constant.
Figure 3.5 shows the effect of E on the average length and variance.

In Figure 3.2 the points closest to both axes are the extreme points, and their
values (E, average length and variance) are given in Figure 3.6. The graph of these
points, average length versus variance, is given in Figure 3.7. The code words which
belong to these selected extreme codes are given in Table 6.

As mentioned, a gain (decrease) in the variance can be the result of a loss
(increase) in average length. Table 7 gives the loss in average length and the gain in
variance for each experimental code. In the third column, a negative variance gain
indicates a loss. This is an exceptional case. Figure 3.8 shows the increase in the
variance gain as the average length loss increases.
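
The loss and gain described here can be illustrated with the values quoted in the text for E = 0.0, E = 0.0005 and E ≥ 0.3. The Python sketch below is only an illustration. Table 7 is not reproduced in this section, so whether it reports absolute or percentage figures is an assumption, and both are computed. The function name is hypothetical.

```python
def loss_and_gain(baseline, code):
    """Compare a modified code with the unmodified (E = 0.0) Huffman code.

    baseline, code: (average length, variance) pairs.
    Returns absolute and percentage loss in average length and gain in
    variance; a negative gain means the variance increased.
    """
    base_avg, base_var = baseline
    avg, var = code
    loss = avg - base_avg
    gain = base_var - var
    return {
        "length_loss": loss,
        "length_loss_pct": 100.0 * loss / base_avg,
        "variance_gain": gain,
        "variance_gain_pct": 100.0 * gain / base_var,
    }


baseline = (4.30771, 1.91820)      # E = 0.0
plateau = (5.08843, 0.08061)       # E >= 0.3
exception = (4.30858, 1.92289)     # E = 0.0005

for label, code in (("E >= 0.3", plateau), ("E = 0.0005", exception)):
    result = loss_and_gain(baseline, code)
    print(label, {name: round(value, 5) for name, value in result.items()})
```

The E = 0.0005 case shows the exceptional behavior mentioned above: the average length increases and the variance gain is negative.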
Figure 3.2. Average Length vs. Variance.


TABLE 3
RESULTS IN INCREASING E VALUE ORDER


E        Average   Variance


0.0      4.30771   1.91820
0.0005   4.30858   1.92289
0.0010   4.31181   1.74548
0.0015   4.31154   [illegible]
0.0020   4.31371   1.42055
0.0025   4.31394   1.39718
0.0030   4.31293   1.93224
0.0035   4.31727   1.44681
0.0040   4.31583   1.43246
0.0045   4.31199   1.73289
0.0050   4.31961   [illegible]
0.0055   4.32575   [illegible]
0.0060   4.32217   1.42507
0.0065   4.32397   [illegible]
0.0070   4.32046   1.34321
0.0075   4.33357   1.36628
0.0080   4.32759   [illegible]
0.0085   4.33118   [illegible]
0.0090   4.33145   1.35891
0.0095   4.34194   [illegible]
0.0100   4.34066   [illegible]
0.0150   4.36739   1.24489
0.0200   4.37334   1.35922
0.0250   4.39608   1.38716
0.0300   4.39537   1.36975
0.0350   4.47384   0.76056
0.0400   4.38381   0.89210
0.0450   4.47201   0.76166
0.0500   4.47085   [illegible]
0.0550   4.44680   0.62955
0.0600   4.56179   0.48224
0.0650   4.46935   0.54266
0.0700   4.48795   0.51705
0.0750   4.59420   0.41606
0.0800   4.49995   0.50834
0.0850   4.48231   0.50933
0.0900   4.48231   0.50933
0.1000   4.57556   0.42755
0.1500   4.73389   0.34298
0.2000   4.68298   0.42142
0.2500   4.89779   0.16814
0.3000   5.08843   0.08061
0.3500   5.08843   0.08061
0.4000   5.08843   0.08061
0.5000   5.08843   0.08061
TABLE 4 
RESULTS IN INCREASING AVERAGE LENGTH ORDER 


Average    E         Variance 

4.30771    0.00      1.91820 
4.30858    0.0005    1.99269 
4.31154    0.0015    [illegible] 
4.31181    0.0010    1.74548 
4.31199    0.0045    1.73289 
4.31293    0.0030    1.93224 
4.31371    0.0020    1.42055 
4.31394    0.0025    1.39718 
4.31583    0.0040    1.43246 
4.31727    0.0035    1.44681 
4.31961    0.0050    [illegible] 
4.32046    0.0070    1.34321 
4.32217    0.0060    1.42507 
4.32397    0.0065    [illegible] 
4.32575    0.0055    1.37092 
4.32759    0.0080    [illegible] 
4.33118    0.0085    1.35962 
4.33145    0.0090    1.35891 
4.33357    0.0075    1.36628 
4.34066    0.0100    [illegible] 
4.34194    0.0095    1.39353 
4.36739    0.0150    1.24489 
4.37334    0.0200    [illegible] 
4.38381    0.0400    0.89210 
4.39537    0.0300    1.38975 
4.39608    0.0250    1.38716 
4.44680    0.0550    0.82955 
4.46935    0.0650    0.54266 
4.47085    0.0500    [illegible] 
4.47201    0.0450    0.76166 
4.47384    0.0350    0.76056 
4.48231    0.0850    0.50933 
4.48231    0.0900    0.50933 
4.48795    0.0700    0.51705 
4.49995    0.0800    0.50834 
4.56179    0.0600    0.48224 
4.57556    0.1000    0.42755 
4.59420    0.0750    0.41606 
4.68298    0.2000    0.42142 
4.73389    0.1500    0.34298 
4.89779    0.2500    0.16814 
5.08843    0.3000    0.08061 
5.08843    0.3500    0.08061 
5.08843    0.4000    0.08061 
5.08843    0.5000    0.08061 
TABLE 5 
RESULTS IN INCREASING VARIANCE ORDER 


Variance       E         Average 

0.08061        0.5000    5.08843 
0.08061        0.4000    5.08843 
0.08061        0.3500    5.08843 
0.08061        0.3000    5.08843 
0.16814        0.2500    4.89779 
0.34298        0.1500    4.73389 
0.41606        0.0750    4.59420 
0.42142        0.2000    4.68298 
0.42755        0.1000    4.57556 
0.48224        0.0600    4.56179 
0.50834        0.0800    4.49995 
0.50933        0.0850    4.48231 
0.50933        0.0900    4.48231 
0.51705        0.0700    4.48795 
0.54266        0.0650    4.46935 
[illegible]    0.0500    4.47085 
0.76056        0.0350    4.47384 
0.76166        0.0450    4.47201 
0.82955        0.0550    4.44680 
0.89210        0.0400    4.38381 
1.24489        0.0150    4.36739 
1.34321        0.0070    4.32046 
[illegible]    0.0200    4.37334 
[illegible]    0.0080    4.32759 
1.35891        0.0090    4.33145 
1.35962        0.0085    4.33118 
1.36626        0.0075    4.33357 
1.37092        0.0055    4.32575 
[illegible]    0.0100    4.34066 
1.38716        0.0250    4.39608 
1.38975        0.0300    4.39537 
1.39353        0.0095    4.34194 
1.39718        0.0025    4.31394 
1.42055        0.0020    4.31371 
1.42507        0.0060    4.32217 
1.43246        0.0040    4.31583 
1.44681        0.0035    4.31727 
[illegible]    0.0050    4.31961 
1.73289        0.0045    4.31199 
1.74548        0.0010    4.31181 
[illegible]    0.0065    4.32397 
1.91820        0.0       4.30771 
1.93224        0.0030    4.31293 
[illegible]    0.0015    4.31154 
1.99269        0.0005    4.30858 
Figure 3.3 Effect of the parameter E on average length. 

Figure 3.4 Variance vs. Parameter E. 

Figure 3.5 Effect of E on Average And variance. 

Figure 3.6 Results And The E Value Of The Experimental Codes. 

Figure 3.7 Average Lengths vs. Variance. 

Figure 3.8 Loss In Average Length vs. Gain In Variance. 
TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


10010011 
100100100 
TOOTOTTOn 
0000011100 
0000011101 
LOOTGGT Or 
1001011001 
LO01TORIGeg 
1001011110 
10010TTIOF 
1001011111 
00000111010 
Oa : 00000111011 
10011 00000111101 
000000 10010010101 
000010 10010010100 
001100 10010111000 
OOTTO! LOOLTOT Tove 
0000010 000001111001 
0000110 0000011110000 
0000111 00000111100011 
1001000 000001111000100 
1001010 000001111000101 
00000110 


Bags 010 
; 101 


PAADOWUr-N =e -O"7l 


E 
N 
R 
U 
L 
S 
K 
D 
i 
M 
o 
0 
G 
B 
C 


Om VE sdee BONG Or 


Tu<dN- 


E= 0.0 (Huffman Coding) 
Average = 4.30771 
Variance = 1.91820 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


10111110 
000111110 
001100000 
0001011110 
1101011110 
OOM EO EO 
TOPPOLEII0 
PEO 0 
OLEVOLTII0 
LEOOP ETT 10 
PA OO 0100111110 
01110 01011100000 
01000 ; 01001101111 
11001 00111100000 
000000 01111100000 
010000 10111100000 
001100 11111100000 
101100 00101011110 
0100000 LOTOEOTLELIO 
0110000 011011100000 
1110000 011001011110 
1110000 111011100000 
ae 0 PELOOLOLITELO 
01100000 
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LtvaAN-: 


E = 0.0005 
Average = 4.30858 
Variance = 1.92290 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


00111110 
010011011 
1001101100 
0101101100 
0011101100 
0111101100 
0000011011 
1111101100 
0100011011 
1000011011 
PELoo 1100011011 
01110 OOLOLTI Tie 
01001 ; LO1011 Tite 
11001 OLTOLI Tie 
OLOT1 O11O01TIGTt 
010000 11107 rie 
001100 11100710 
011110 00001101100 
Pe 10001101100 
0110000 01101101100 
1110000 O101110Tios 
0101100 1110110Tfes 
Pe 11011101100 
FOP G Ie 


Space 
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em -oO7 


10100 
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E = 0.001 
Average = 4.31181 
Variance = 1.74548 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


11001000 
00011100 
10011100 
01011100 
11011100 
COEEETOG 
01111100 
LOR GeO 
Lidi T1100 
011000100 
111000100 
0000000100 
0100000100 
1000000100 
1100000100 
0010000100 
0110000100 
1010000100 
1110000100 
0001000100 
1001000100 
0101000100 
1101000100 


Space 101 


P~OWUPrN =e -O'TF 


00110 
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LOT 
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110100 
000011 
100011 
0100100 
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E = 0.04 
Average = 4.38381 
Variance = 0.89202 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


S 


my 
a 
om 


00010011 
10010011 
101101000 
011101000 
111101000 
000010000 
100010000 
0010101000 
0110101000 
1010101000 
1110101000 
0001101000 
01001 : 1001101000 
11001 01000101000 
00011 11000101000 
001000 00100101000 
001100 01100101000 
101100 10100101000 
110011 11100101000 
1010000 000000101000 
0110000 100000101000 
1110000 010000101000 
1010011 110000101000 
10010000 
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E = 0.0045 
Average = 4.31199 
Variance = 1.73289 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


Space 


P-~ 00 WUrA) rie - oO" 
OrPrOOCF OF OF FO 


001111100 
FOLEOLOOO 
Od EEE © 
Cee Oo 
P80 
0000111001 
1000111001 
0100111001 
1100111001 
0000101000 
1000101000 
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E = 0.0055 
Average = 4.32575 
Variance = 1.37092 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


101001 
011001 
117002 
0011101 
LOTTTO# 
Olt oF 
Titi 
OOCT TOM 
0101101 
1001101 
LIGETor 
OOLT ECs 
TOL Toe 
0111011 
Lilioct 
00001101 
01001101 
10001101 
00101101 
LLOOT EG 
01101101 
10101101 
LTTOrirom 


apes 


me -O't 
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LVUAN-: 


E = 0.075 
Average = 4.5942 
Variance = 0.41607 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


1001011 

0101001 

1101001 

0011001 

LORLOOT 

0111001 

1111001 

00001101 
01001101 
10001101 
COLO OT 
00100 EEOOTTOL 
10100 ; LOLOT1 OL 
01100 OTTO O 1 
11100 00011101 
00110 ieeOn Lot 
Ton 0 10011101 
01110 01011101 
Jere O Onan a 
001010 00111101 
101010 OEE O 1 
011010 TOR OT 
Pino 20 ae Ol 
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E = 0.08 
Average = 4.49995 
Variance = 0.50838 
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CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol 


apace 
A 


Code Word 


Lod 
OOl1 
Og 
01100 
11100 
00010 
10010 


TABLE 6 


Symbol 


re -O"7) 


Code Word 


101000 
011000 
111000 
000100 
100100 
010100 
110100 


01010 
11010 
00110 
10110 
OVETO 
EEi10 
00001 
10001 
01001 
FLOOL 
00101 
10101 
000000 
100000 
010000 
110000 
001000 


0000111 
L000 Di 
OLOOL 


P-—™ OO ly UMA) 


E 
N 
R 
U 
L 
S 
K 
D 
T 
M 
ae 
0 
G 
B 
Cc 


tO SV bw) SH O~- 


Litan-: 


E = 0.15 
Average = 4.73389 
Variance = 0.34298 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Code Word 


Symbol Code Word Symbol 


space 
i 


E 
N 
R 
U 
L 
S 
K 
D 
ic 
M 
= 
0 
G 
B 
C 


Lid 

00000 
10000 
01000 
11000 
00110 
Heo 
OTTO 
11 10 
00001 
10001 
01001 
11001 
00101 
10101 
01101 
TELoT 
00011 
10011 
01011 
OO) 
LEO 
PO 


- -O'" 


&-—™ 00 ) UT A) 


010010 
100010 
110010 
001010 
011010 
101010 
TLETOLO 
0000100 
1000100 
0100100 
1100100 
0010100 
1010100 
0110100 
EELoveo 
0001100 
1001100 
0101100 
1101100 
0011100 
OLE OO 
TOLL TOO 
PET oOe 


tO oe) Be swdee SONY OW 


LUI aN: 


000010 


E = 0.25 
Average = 4.89779 
Variance = 0.16814 
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TABLE 6 
CODE WORDS OF THE EXPERIMENTAL CODES 


Symbol Code Word Symbol Code Word 


apace 
A 


E 
N 
R 
U 
L 
S 
K 
D 
ty 
M 
x 
O 
G 
B 
€ 


LIAN: 


E = 0.30 


TEEOO 
00010 
10010 
01010 
11010 
00101 
Toro] 
01101 
11101 
00011 
10011 
OFOTL 
PO yi 
00111 
LORE 
OE 
Turi 
000000 
100000 
010000 
110000 
001000 
101000 
011000 


Average = 5.08843 
Variance = 0.08061 
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P-~ OO UG) UT ND 


tO 4° bE ~dee SONG Ore 


111686 
000110 
100110 
010110 
TIG ue 
001110 
OLMaG 
10 Tare 
000001 
Lie 
010001 
100001 
110@Oz 
001001 
011007 
101001 
11 f00u 
000100 
100100 
010100 
001100 
110100 
101100 
C. MODIFIED HUFFMAN CODING AFTER DROPPING THE LESS 
FREQUENT SYMBOLS 


In dropping the less frequent source symbols, the main idea is to set a limit 
probability, P. The symbols which have a lower probability than the limit 
probability are dropped. If the P value is very high, the meaning of the message might 
be distorted. Conversely, if the P value is very small, the dropping process will 
have little or no effect on the average code length and variance. 

In this section, I examined seven different P values. At each step, I dropped the 
symbols with probabilities lower than P and ran the same LISP program for the 
experimental E values given in Figure 3.6. 

One point must be mentioned. In every type of message, numbers have a 
very important place. Hence, numbers written as digits are not dropped, even if 
they have a lower probability than the limit probability. 
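
The dropping rule can be sketched as follows. This is a minimal Python illustration and not the program used in this study: the function name `drop_rare_symbols` is hypothetical, and renormalizing the remaining probabilities so that they sum to one is an assumption made here so that the result can be fed to a coding routine.

```python
from typing import Dict


def drop_rare_symbols(probs: Dict[str, float], limit: float) -> Dict[str, float]:
    """Drop symbols whose probability is below `limit`, except digits."""
    kept = {
        symbol: p
        for symbol, p in probs.items()
        # Digits are always kept, whatever their probability.
        if p >= limit or symbol.isdigit()
    }
    total = sum(kept.values())
    # Assumption: the surviving probabilities are rescaled to sum to 1.
    return {symbol: p / total for symbol, p in kept.items()}


# Made-up probabilities, for illustration only.
example = {"a": 0.5, "e": 0.3, "x": 0.185, "j": 0.01, "7": 0.005}
reduced = drop_rare_symbols(example, limit=0.015)
```

In this made-up example the symbol "j" is dropped, while the digit "7" survives although its probability is below the limit.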

The effect of dropping the source symbols at each step on the meaning of the 
message is the subject of the next chapter. Here I examined the technical aspect. In 
other words, disregarding the meaning of the information, I increased the P value and 
examined the changes in the average code length and variance for the experimental E 
values. 

The limit probabilities (P) were chosen arbitrarily. These limit probabilities and 
the corresponding step numbers are given in Figure 3.9. 

At every step, symbols which have lower probabilities than the P value are 
dropped. Table 8 shows the dropped symbols in each step. 

The results were examined in two dimensions. In the first, changes in average 
length and variance for each P value were examined, while using the experimental E 
values. In the second, changes in average length and variance for each E value, while 
using the selected P values, were examined. All results are given in Table 9. For each E 
value and step, the average length and the variance can be seen. 
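
The two dimensions amount to averaging a grid of results along its columns (one mean per step) and along its rows (one mean per E value). The sketch below is a hedged Python illustration: the helper names are hypothetical, and only a small subset of the average lengths from Table 9 (steps 1 to 3 for three E values) is used, so its means are not the ones reported in the table.

```python
from statistics import mean
from typing import Dict, List

# Subset of Table 9: average lengths for steps 1-3, keyed by E value.
avg_len: Dict[float, List[float]] = {
    0.0010: [4.2992, 4.27963, 4.25886],
    0.0045: [4.3136, 4.28641, 4.27408],
    0.0800: [4.4912, 4.40419, 4.38820],
}


def mean_per_step(grid: Dict[float, List[float]]) -> List[float]:
    """First dimension: one mean per step (column), taken over all E values."""
    return [mean(column) for column in zip(*grid.values())]


def mean_per_e(grid: Dict[float, List[float]]) -> Dict[float, float]:
    """Second dimension: one mean per E value (row), taken over all steps."""
    return {e: mean(row) for e, row in grid.items()}


step_means = mean_per_step(avg_len)
e_means = mean_per_e(avg_len)
```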

1. Evaluation of the First Dimension 

As mentioned earlier, the first dimension is the behavior of the average length 
and variance for each P value, while employing the experimental E values. The purpose is 
to understand how the results change as the P value is increased. 

This dimension is represented in Table 9, in rows, for each E value. The last 
row consists of the mean average lengths and mean variances. 
Limit Probabilities Step Number 


~] 
yn 


QO. 0004 
QO. Q009 
©; 0002 
QO. O006 
QO. O09 
O#2025 
0.025 


NIOULBONDH- 





Figure 3.9 Limit Probabilities At Each Step. 


TABLE 8 
DROPPED SYMBOLS AT EACH STEP 


Step Number Dropped Symbol 


CX ae 3 Ww 


Step 1 symbols and j ; ( ) 


Step 2 symbols and ” ° 
Step 3 symbols and f h 
Step 4 symbols and p v 
Step 5 symbols and z . , 
Step 6 symbols and c b g 





When the last row is examined, it can easily be seen that the mean values tend to 
decrease as P increases. In other words, the more symbols that are dropped, the 
smaller the average code length and variance become. These last row values are given in 
Figure 3.10. The first row of Figure 3.10 gives the mean values without dropping any 
symbols. This is represented as step 0. Changes in the mean average length and mean 
variance, as the P value increases (each P corresponds to a step number), are given in 
Figure 3.11 and Figure 3.12, respectively. 


TABLE 9 
RESULTS OF EACH STEP FOR EXPERIMENTAL E VALUES 


E               Step1    Step2     Step3     Step4     Step5    Step6     Step7     Mean 

0.0     Avg.    4.2985   4.25824   4.25871   4.2088    4.1408   4.0135    3.825     4.163 
        Var.    1.8441   1.72806   1.76745   1.52891   1.3175   0.9597    1.0069    [illegible] 
0.0005  Avg.    4.2985   4.2786    4.25887   4.20932   4.1413   4.01413   3.826     4.150 
        Var.    1.8464   1.71537   1.76699   1.53131   1.3200   0.96142   1.0089    [illegible] 
0.001   Avg.    4.2992   4.27963   4.25886   4.20932   4.1413   4.01413   3.826     4.147 
        Var.    1.8464   1.71537   1.76699   1.53131   1.3200   0.96142   1.0089    1.451 
0.0045  Avg.    4.3136   4.28641   4.27408   4.21096   4.1431   4.01413   3.8260    4.152 
        Var.    1.4017   1.24142   1.15705   1.51672   1.2720   0.96142   1.0089    1.223 
0.0055  Avg.    4.3092   4.27246   4.27246   4.22498   4.1431   4.01413   3.8327    4.156 
        Var.    1.3282   1.92706   1.15441   1.31839   1.2709   0.96142   0.69510   [illegible] 
0.0400  Avg.    4.3751   4.36127   4.34814   4.35133   4.2715   4.07948   3.9243    4.244 
        Var.    0.8299   0.76136   0.71132   0.60011   0.5438   0.48310   0.5410    [illegible] 
0.0800  Avg.    4.4912   4.40419   4.38820   4.36230   4.3152   4.24123   3.8904    4.299 
        Var.    0.4708   0.60980   0.52578   0.36693   0.2803   0.21604   0.4923    0.423 
0.0750  Avg.    4.5616   4.56811   4.57221   4.41920   4.4192   4.24135   4.1263    4.425 
        Var.    0.3736   0.36446   0.35245   0.31971   0.3023   0.20952   0.1366    [illegible] 
0.1500  Avg.    4.4903   4.47942   4.68289   4.42917   4.6583   4.31951   4.1463    4.458 
        Var.    0.5091   0.46734   0.25246   0.31476   0.2289   0.22842   0.1366    0.305 
0.2500  Avg.    5.0232   4.70216   4.87094   4.66697   4.5112   4.31960   4.3278    4.632 
        Var.    0.0226   0.28173   0.12277   0.23258   0.2605   0.53337   0.2205    [illegible] 
0.3000  Avg.    5.0232   5.00763   4.86094   4.66697   4.3058   4.55952   4.3279    [illegible] 
        Var.    0.0226   0.00758   0.12274   0.23258   0.3571   0.24646   0.2204    [illegible] 
Mean    Avg.    4.4940   4.45002   4.45966   4.36087   4.2982   4.16619   4.08176 
        Var.    0.9542   0.98360   0.88176   0.86303   0.7704   [illegible]   [illegible] 


The general tendency is that the mean value decreases as the P value 
increases. On the other hand, some P values have an effect which is contrary to the 
general tendency. For example, in step 3 (P = 0.000175) there is an increase in both 
mean values over those in step 2. Another example is that for E = 0.08000, in steps 2 and 3 
(P2 = 0.0009 and P3 = 0.000179), the variances are larger than the variance of step 1. 

These exceptions result from the way the particular probability values combine in the 
reduction process of Huffman coding. This is the reason why each source alphabet 
should be examined separately. Each has its own optimal P and E values. 
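
For reference, the reduction process of plain Huffman coding (the E = 0 case) can be sketched as below. This is the standard textbook algorithm written in Python, not the modified procedure with parameter E, whose details are not given in this section. The function name `huffman_lengths` is hypothetical. It returns only the code word lengths, which is all that the average length and variance depend on.

```python
import heapq
from itertools import count
from typing import Dict, List, Tuple


def huffman_lengths(probs: Dict[str, float]) -> Dict[str, int]:
    """Code word length of each symbol under standard Huffman coding."""
    if len(probs) == 1:
        return {symbol: 1 for symbol in probs}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap: List[Tuple[float, int, List[str]]] = [
        (p, next(tiebreak), [symbol]) for symbol, p in probs.items()
    ]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in probs}
    while len(heap) > 1:
        # Reduction step: merge the two least probable groups.
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for symbol in group1 + group2:
            lengths[symbol] += 1  # every merge adds one digit to its members
        heapq.heappush(heap, (p1 + p2, next(tiebreak), group1 + group2))
    return lengths


# Made-up dyadic source, for illustration only.
lengths = huffman_lengths({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
```

Because each merge depends on which two probabilities happen to be smallest at that moment, removing or rescaling a few symbols can change the merge order, and with it the lengths of many code words.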

Additionally, it should be mentioned that an optimal E value for a specific P 
value might not be optimal for another P value, and vice versa, the P value which is 
optimal for any E value might not be optimal for another E value. For example, in 
Table 9, in step 2 (P = 0.0009), for E = 0.00400, the variance is 1.22708, which is 
smaller than the variance of step 1 for the same E. But, for the same P value (step 2), 
for E = 0.08000, the variance is 0.60980, which is larger than the variance of step 1 for 
the same E. 

Since the limit probability, P, has some effect on the meaning of the messages, 
a P value should first be chosen for a given source alphabet in a way that will not 
destroy the meaning of the messages. Then the optimal E value, which gives the 
optimal average length and variance in Huffman coding, should be chosen for the 
optimal P. 

The fourth chapter of this study, after examining the effect of these seven 
experimental P values on the Turkish messages, finds the optimal P. 


Figure 3.10 Mean Average Lengths And Mean Variances. 
Figure 3.11 Mean average Length vs. Steps. 


2. Evaluation of the Second Dimension 

The second dimension of the results involves the changes in average length 
and variance for each E value, while using the different P values. This dimension is 
represented by columns for each P value in Table 9. It shows us the changes in 
average length and variance for each experimental E value while employing each P 
value. The last column of Table 9 gives the mean values for each E value: the mean 
average lengths and the mean variances. By examining the last column, we can see the 
behavior of the average length and variance for each E value, with the total effect of 
the different P values. These last column values are given in Figure 3.13. The changes in 
mean average length with respect to different E values and in mean variance with 
respect to different E values are given in Figure 3.14 and Figure 3.15, respectively. 

The general tendency is for the mean average length to increase as the E value 
increases and for the mean variance to decrease as the E value increases. For some 
exceptional values, the same comment can be made as was made previously. Thus, the 
optimal E value should be chosen for each P value separately. 
Figure 3.12 Mean Variance vs. Steps. 

Figure 3.13 Mean Average Lengths And Mean Variances For Each E. 

Figure 3.14 Mean Average vs. E Values. 

Figure 3.15 Mean Variance vs. E Value. 
IV. STATISTICAL EVALUATION OF THE DROPPING PROCESS 


A. THE DROPPING PROCESS 

In the previous chapter, the technical evaluation of the dropping process was 
discussed. In seven steps, a different P value for each step (see Figure 3.9) was 
employed. The results of these encoding processes for each P value showed that, as P 
increased, the average code length and variance decreased. But the results still contain 
the "average code length vs. variance" trade-off. 

In this chapter, the main issue is to protect the meaning of the message from any 
distortion during the dropping process. Although the more symbols are dropped, the 
smaller the average length and variance become, in real-life applications we cannot drop as 
many as we would like. At this point a limit probability (P) becomes the subject of 
discussion. 

To find the limit probability (P) for the given Turkish alphabet, four different 
short articles are examined using the Pascal language computer program in Appendix 
C. Each article is rewritten seven times using this program. In each step, the experimental 
P values, which were given in Figure 3.9, are employed. The Pascal program does not 
rewrite the symbols which have lower probabilities than the P value. The original 
short articles are given in Appendix D. 
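
The rewriting step can be sketched in Python as follows. This is not the Pascal program of Appendix C. The function name `rewrite_without_rare` is hypothetical, and two assumptions are made: digits are always kept (as stated in Chapter 3), and a symbol that is missing from the probability table is treated as having probability zero.

```python
from typing import Dict


def rewrite_without_rare(text: str, probs: Dict[str, float], limit: float) -> str:
    """Rewrite `text`, omitting symbols whose probability is below `limit`."""
    return "".join(
        ch for ch in text
        # Assumption: digits are kept; unknown symbols count as probability 0.
        if ch.isdigit() or probs.get(ch, 0.0) >= limit
    )


# Made-up probability table, for illustration only.
table = {"a": 0.4, " ": 0.3, "b": 0.29, "q": 0.01}
rewritten = rewrite_without_rare("ab q 7a", table, limit=0.015)
```

Running the same function once per limit probability gives the seven rewritten versions of an article that the readers were asked to grade.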


B. STATISTICAL EVALUATION FOR LIMIT PROBABILITY (P) 

After each step, the four short articles were read by Turkish officers attending 
N.P.S., in order to grade the meaning level of each article. Fifteen officers graded these 
articles, with grades ranging from 0 to 4. A grade of 0 means nothing is understandable, and 4 
corresponds to the level at which the meaning is very clear. These grade numbers and 
the corresponding meaning levels are given in Figure 4.1. 

The results of the survey showed that the meaning level up to the seventh step 
exhibited a slow decrease. These decreases stayed in the clear level. But, in the seventh 
step, it suddenly dropped into the very difficult region. Figure 4.2 gives the average 
grades for each article at every step. Figure 4.3 shows the resulting average meaning 
level of each step. The change in the meaning level while the limit probability (P) 
increases can be examined in Figure 4.4. 
Grade Number    Meaning Level 

0               Nothing is understood. 
1               Very difficult. 
2               Difficult. 
3               Meaning is clear. 
4               Meaning is very clear. 

Figure 4.1 Grade Classification. 


Article No Sl 


Total Grade: 16 15.08 14.82 13.07 





Figure 4.2 Grades For Each Article At Every Step. 


Step Number Hl 


Meaning Level 4 





Figure 4.3 Average Meaning Levels At Each Step. 


C. OPTIMAL LIMIT PROBABILITY 

Our purpose is to choose the optimal limit probability for the given Turkish alphabet. 
This optimal limit probability is supposed to give a decrease in average length and 
variance, while remaining in the clear level. This logic leads us, by examining Figure 
4.4, to choose the step 6 probability as the optimal one. The optimal limit probability, 
used in step 6, is 0.015. The rewritten forms of the articles at step 6 are given in 
Appendix E. 

In Chapter 3, Table 9 gives the average lengths and variances for each step (P 
value) and for each E value. The average lengths and variance for P = 0.015 and 
corresponding E values are given in Figure 4.5. The code words which are the results 
of the experimental E values and P = 0.015 are given in Table 10. 

The trade-offs between average length and variance, after the dropping process, 
are given in Figure 4.6. The same conclusion found in Chapter 2 could be reached, 
namely that a decrease in variance requires an increase in the average length. 

The modified Huffman coding results (average length and variance values for the 
experimental E values) without dropping any symbol (given in Figure 3.6) and the 
values after dropping the symbols which have lower probabilities than 0.015 are 
compared. 

The average length and variance differences between the encoding processes, 
before and after dropping the symbols for P = 0.015, are given in Figure 4.7. In this 
figure, the positive values show the decreases and the negative values show the 
increases in the results after dropping. 

Generally, a decrease can be seen in both average length and variance. But for 
E values of 0.2500 and 0.3000, the variances after the dropping process increased, by 
0.36523 and 0.16585, respectively. Figure 4.8 shows the change in average lengths 
before and after dropping while E increases. It can be seen from Figure 4.9 that, as the 
E value increases, the difference between average lengths, before and after dropping, 
also increases. Further, for E = 0.25 and larger values, the variance after dropping 
becomes larger than the variance before dropping. 

We want to decrease the variance, while experiencing some increase in the 
average length as a benefit of the dropping process. Hence, in Figure 4.7 the minimum 
increase in average length and maximum decrease in variance values lead us to our 
objective. 
TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


Space 
i 


PFOOLTIIO1 
Pe TO 
OigitiiOl 
POCO TOL 
000011101 
001011101 
102 ODE OL 
OEE LOr 
PET erOr 


IDO LPWOWUNHOAWAO 


E 
N 
R 
U 
L 
S 
K 
D 
iE 
M 
ie 


E = 0.00 
Average = 4.01359 
Variance = 0.95975 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


00101 
01111 

att aka 
010101 
000110101 
010110101 
001110101 
1OilT Oiler 
11i1iorGe 
O1T11GTGF 
0100110101 
1100110101 
0110110101 
1110110101 


Space 
A 


E 
N 
R 
U 
L 
S 
K 
D 
a 
M 
x 


AIDW POWUNHONWAO 


Pe 


E = 0.0005 
Average = 4.01413 
Variance = 0.96142 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


Space 010 
001 
Oli 
0000 


00101 
Oli 
Tit 
010101 
100110101 
POL EOLe1 
OORT TOReL 
LOELLO LO 
eee rored 
OTE OLOL 
0000110101 
1000110101 
0010110101 
1010110101 


WOO AOWUNHONWAO 


A 
E 
N 
R 
U 
L 
S 
K 
D 
1 
M 
i 


E = 0.0010 
Average = 4.01413 
Variance = 0.96142 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


ace 


10001 
OTOL 
1100 
OLLIE 
010111110 
11017 
00111 Peis 
101 Tae 
Lili pris 
OlLiT rere 
0000111110 
1000111110 
01007 Tis 
1100TTIT8s 


P 
I 
A 
E 
N 
R 
U 
L 
S 
K 
D 
5 
M 
4 


ARDOPOWMUNHONWOAO 


00001 


E = 0.0045 
Average = 4.01413 
Variance = 0.96142 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


10001 
01011 
LTOe 
001110 
010101110 
110101110 
001101110 
101101110 
LE Por 10 
OPT © 
0000101110 
1000101110 
0100101110 
1100101110 


"Page 


ty 


AIKDOHLOWUNHONMWAO 


N 
R 
U 
L 
S 
K 
D 
5 
M 
x 


E = 0.0055 
Average = 4.01413 
Variance = 0.96142 
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TABLE 10 


CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word 


Space 101 
i 


E 
N 
R 
U 
L 
S 
K 
D 
£ 
M 
x 


E = 0.040 
Average = 4.07994 
Variance =0.48310 
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Symbol 


IDO POWUNHOAWAO 


Code Word 


00010 
10010 
00110 
101101 
01011000 
11011000 
00111000 
10111000 
11111000 
01111000 
000011000 
100011000 
010011000 
110011000 





TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


01100 
PELOO 
00010 
10010 
00001 
10001 
0001110 
POO ET TO 
0101110 
PaO 1110 
POMEL O 
HOTT TO 
OPEL O 
ree 20 


ce 


pa 
I 
A 
E 
N 
R 
U 
L 
S 
K 
D 
a. 
M 
24 


ANDO POWUNHOANWHAO 


E = 0.080 
Average = 4.24123 
Variance = 0.21604 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


Space 01100 


OLTODELE 
1107 Ere 


AIKHOHPOWUNHOAWAO 


A 
E 
N 
R 
U 
L 
S 
K 
D 
aT 
M 
x 


E = 0.075 
Average = 4.14135 
Variance = 0.20952 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


01100 
11100 
00100 
10110 
01110 
EEE O 
000101 
100101 
110101 
010101 
001101 
101101 
011101 
ib, Bilal)! 


oo 


IDO POWMNNHOAWAO 


A 
E 
N 
R 
U 
L 
5 
K 
D 
i, 
M 
fe 


10100 


E= 0.15 
Average = 4.31951 
Variance = 0.22842 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


010010 
110010 
001010 
1O10re 
011010 
Litoro 
000110 
100110 
LTOt¢e 
OTOTEe 
001110 
LOPES 
OT ere 
TiTEre 


Space 
i 


UDO POWUNHOAWAO 


E 
N 
R 
U 
L 
S 
K 
D 
a 
M 
Y 


100010 


E = 0.25 
Average = 4.31693 
Variance = 0.53337 
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TABLE 10 
CODE WORDS AFTER DROPPING THE LESS FREQUENT SYMBOLS 


Symbol Code Word Symbol Code Word 


01010 
11010 
00110 
10110 
01110 
PEL Eo 
OO00O11 
10011 
Peon 
01011 
OOl111 
LOTTI 
Ole t 
Pel 


Space 
} 


00010 
10010 


ADO HOWUNHOOAWAO 


E 
N 
R 
U 
L 
S 
K 
D 
T 
M 
i 


E = 0.30 
Average = 4.55952 
Variance = 0.24645 
Figure 4.5 Average Lengths And Variance At Step 6. 

Figure 4.6 Average Length vs. Variance At Step 6. 

Figure 4.7 Average Length And Variance Differences For C10 And C11. 

Figure 4.8 Average Length Differences vs. E Values. 

Figure 4.9 Variance Differences vs. E Value. 
V. REDUCTION IN BANDWIDTH 


A. QUEUEING THEORY IN A COMMUNICATIONS SYSTEM 

In communications systems, as with other systems, the managerial view is to use 
available resources effectively. The limited availability of the frequencies, according to 
which transmission bandwidths are determined, makes the managerial job harder. The 
main idea is to transmit in a way that uses the minimum required bandwidth. This 
adjustment is made by employing output rates which satisfy the objective. 

Our communications system model has its own characteristics. The input rate of 
the system is the number of digits which come to the transmitter in a unit time. These 
digits are a combination of 0's and 1's. The frequency distribution of 
arrivals is similar to the theoretical Poisson distribution. As described in [Ref. 8,] the 
Poisson distribution occurs when arrivals are random in a given period of time (unit 
time). This means that although we know the mean arrival rate per unit time, the exact 
arrival cannot be predicted at any given moment. The input rate is represented by lambda 
(λ). 

The output rate is the number of digits which are transmitted in a given period of 
time (unit time). The inverse of the output rate is described as the output time and has 
the negative exponential distribution. The output rate is represented by mu (μ). 

In our model, the input rate is directly affected by the symbols which comprise the 
messages and are intended to be sent through our communications system. Since each 
symbol has its own digital representation after the encoding process (Modified 
Huffman coding), the number of digits which enter the system varies. In other words, 
the number of digits ranges between a lowest value and a peak value. 

As a system manager, we can adjust our output rate in two ways. The first way 
is to have an output rate as high as the peak value of the input. But, since the peak 
value doesn't always occur, resources will be wasted. The second option is to set 
the output rate close to the mean input rate and then store the excess digits in a buffer. 
Having a buffer allows us to reduce the output rate. The reduction in the output rate 
also means a gain in the transmission bandwidth. These two together allow 
management to achieve effective usage of the resources. 
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At this point, the size of the buffer should be considered. The size of the buffer 
directly affects the efficiency of the system. Gains in bandwidth should result in buffer 
size increase, i.e. having a very large buffer. 

Every digit is transmitted according to an arrival sequence. The first digit 
arriving is always the first digit transmmitted, and so on. This technique is called first - 
in - first - out (FIFO.) This guarantees that the sequence, the meaning of the symbols 
and the meaning of the messages are not destroyed. 

Another characteristic of the system is that it has just one transmitter. Every 
digit which comes into the system is transmitted through one channel. If the output 
rate is smaller than the input rate, the buffer size will increase without bound. So, the 
input rate should be less than or equal to the output rate (λ ≤ μ).

In summary, our communications system model can be represented as M/M/1.
This representation means a single transmitter (1), Poisson arrivals (M), and exponential
output (M).
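
The M/M/1 description can be made concrete with the standard steady-state formulas of queueing theory. The following is a minimal sketch and is not one of the thesis programs; the function name `mm1_metrics` and the sample rates in the usage note are illustrative assumptions. In the textbook model a steady state exists only when λ is strictly less than μ, so the sketch reports an unbounded backlog at λ = μ as well.

```python
def mm1_metrics(lam, mu):
    """Standard steady-state M/M/1 quantities.

    lam: input (arrival) rate, digits per unit time.
    mu:  output (service) rate, digits per unit time.
    Both names are illustrative; the text uses the symbols lambda and mu.
    """
    if lam <= 0 or mu <= 0:
        raise ValueError("rates must be positive")
    rho = lam / mu  # utilization of the single transmitter
    if rho >= 1.0:
        # No steady state: the expected backlog grows without bound.
        return {"rho": rho,
                "mean_in_system": float("inf"),
                "mean_in_queue": float("inf")}
    return {
        "rho": rho,
        "mean_in_system": rho / (1.0 - rho),       # L  = rho / (1 - rho)
        "mean_in_queue": rho * rho / (1.0 - rho),  # Lq = rho^2 / (1 - rho)
    }
```

For example, `mm1_metrics(4.0, 5.0)` gives a utilization of 0.8 and a mean of about four digits in the system, while `mm1_metrics(5.0, 5.0)` reports an infinite mean backlog.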


B. SIMULATION OF THE COMMUNICATIONS SYSTEM 
1. Simulation Without Dropping Process 

The communications system has been simulated with computer programs
written in Pascal. The first program, given in Appendix F, is the simulation of the
communications system which encodes the Turkish alphabet symbols without dropping
any of them. This program was run nine times with nine different codes. Eight of
these codes were chosen among the eleven codes which were given in Table 6. For the
ninth run, the “block code” was used. The nine codes are given in Figure 5.1.

The output of the program is the maximum buffer size which is required for
transmitting the first 200 characters of the article given in Appendix A. The output
rates were chosen arbitrarily; they are 4.01359, 4.2, 4.4, 4.6, 4.8, 5.1, 5.5, and 6.0. The
maximum output rate, 6.0, is the rate that is required to transmit the block codes
without any buffer.
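
The idea of a maximum required buffer size can be illustrated with a small sketch. This is not the Pascal program of Appendix F, which is not reproduced here; it is a simplified model under stated assumptions: one symbol arrives per unit time and contributes as many digits as its code word is long, the transmitter removes up to `output_rate` digits per unit time in FIFO order, and the buffer requirement is the largest backlog observed. The function name and the sample code lengths are hypothetical.

```python
def max_buffer_size(code_lengths, output_rate):
    """Largest backlog (in digits) seen while transmitting a message.

    code_lengths: code word length of each symbol, in arrival order.
    output_rate:  digits the transmitter can send per unit time.
    Assumption: exactly one symbol arrives per unit time.
    """
    backlog = 0.0
    peak = 0.0
    for n_digits in code_lengths:
        backlog += n_digits                         # digits of the new symbol enter the buffer
        backlog = max(0.0, backlog - output_rate)   # transmitter drains the buffer (FIFO)
        peak = max(peak, backlog)
    return peak
```

With a block code of six digits per symbol and an output rate of 6.0, `max_buffer_size([6] * 200, 6.0)` returns 0.0, which matches the statement that the block code needs no buffer at that rate. A variable-length sequence such as `[3, 5, 7, 3]` at rate 4.0 peaks at 4.0 digits.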

The output rates which are lower than 6.0 represent the gain in
bandwidth. The required maximum buffer sizes for each code given in Figure 2.1, at
each output rate, are given in Table 11. The gains in bandwidth at each output
level are given in Figure 5.2.
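
The percentage gain in bandwidth can be sketched as the relative reduction of the output rate with respect to the block code rate of 6.0. The formula is an assumption, since Figure 5.2 is not reproduced here, but it is consistent with the 33.11% gain quoted in the text for the output rate 4.01359. The names below are illustrative.

```python
BLOCK_RATE = 6.0  # output rate needed by the block code without any buffer
OUTPUT_RATES = [4.01359, 4.2, 4.4, 4.6, 4.8, 5.1, 5.5, 6.0]  # rates listed in the text


def bandwidth_gain_percent(rate, block_rate=BLOCK_RATE):
    """Relative bandwidth saving, in percent, versus the block code rate."""
    return (block_rate - rate) / block_rate * 100.0


if __name__ == "__main__":
    for rate in OUTPUT_RATES:
        print(f"{rate:8.5f}  {bandwidth_gain_percent(rate):6.2f}%")
```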

When Table 11 is examined, the following conclusions can be drawn:

(1) As the gain in bandwidth increases (lower output rates), the
maximum required buffer size also increases.

(2) Modification of Huffman coding (increase in average code
length and decrease in variance) gives us a lower buffer
size. (This conclusion is true when λ < μ.)


Figure 5.1 Code Lengths And Code Names. (Columns: average length (input rate), code name.)

Figure 5.2 Percentage Gain In Bandwidth For Each Output Rate.


2. Simulation With Dropping Process

The second experiment is the simulation of the communications system which
encodes the Turkish alphabet after dropping those source symbols whose
probabilities are less than the optimal limit probability (P = 0.015). This program was
also run nine times, applying nine different codes. Eight of these codes were chosen
among the eleven codes which were given in Table 11. Again, block coding is used as
the ninth code. The average lengths and the code names of these nine codes are given in
Figure 5.3. The computer program is given in Appendix G.

TABLE 11
MAXIMUM REQUIRED BUFFER SIZES WITHOUT DROPPING
(Columns: input rate, output rate, code.)

The output of this program is the same as that of the first program: the maximum
required buffer size for the first 200 characters of the article given in Appendix A.
The same output rates which are given in Figure 5.2 were used.

The maximum buffer size requirements for each code, at eight different output 
rates are given in Table 12. After examining Table 12, we can reach the same 
conclusion as we did in the first section, that is: higher average length results in 
smaller buffer size. 

We must compare the two tables (Table 11 and Table 12) in order to find
out the effect of the dropping process on bandwidth and buffer size. In order to
compare the encoding processes before and after dropping easily, the output rates are
chosen to be exactly the same in both models. The minimum value of the output rate
is the minimum average length value reached after the dropping process. The
corresponding code is named “code A2.” It is actually the Huffman coding process result
after dropping, because the applied E value is 0.0. This output rate gives a 33.11% gain
in bandwidth when compared with the block coding output rate (6.0).

It can easily be seen that the buffer sizes which result from the Modified
Huffman coding process after dropping the less frequent symbols are much smaller
than those obtained without the dropping process. For example, let’s compare the buffer
sizes which are the results of codes D1 and D2. Both D1 and D2 used the same
modification parameter, E = 0.08. Code D1 was produced without dropping and code D2
was produced after dropping. Since the buffer sizes are obtained by employing the same
output rates, both D1 and D2 have the same percentage bandwidth gain for each
output rate. The dropping process therefore gives a reduction (gain) in buffer size in
addition to the gain in bandwidth. The buffer sizes and the percentage gains of Code D2
are given in Figure 5.4.

By looking at the percentage buffer size gains of Code D2 at each output 
level, we can easily see the positive effects of the dropping process on the buffer size. 
Between Code D1 and Code D2, the mean percentage gain is 75.35%.

Hence, it can easily be concluded that besides a maximum 33.11% reduction 
in bandwidth, the dropping process can give us an average of 75.35% reduction in 
buffer size. The change in buffer size for our sample codes D1 and D2, while 
increasing the output rate (decreasing the bandwidth gain) can be seen in the graph 
given in Figure 5.5. The vertical axis is the maximum required buffer size. The 
horizontal axis is the output rate, with the bandwidth gain given in parentheses. The
area between curves D1 and D2 gives the buffer size gain of Code D2.

TABLE 12
MAXIMUM REQUIRED BUFFER SIZES WITH DROPPING
(Columns: input rate, output rate.)

Figure 5.3 Code Lengths And The Code Names.

Figure 5.4 Buffer Size Gain Of Code D2.

Figure 5.5 Max. Buffer Size Of The Codes D1 And D2.


VI. TWO ALTERNATIVE APPROACHES


A. DROPPING MORE FREQUENT SYMBOLS 

During my experiments, another approach to the dropping process appeared 
logical, namely to drop the more frequent symbols instead of the less frequent symbols. 
The theoretical explanation and an example of this experiment are given below. 

Since the main idea is to decrease the variance of the code words, the smaller 
variance can be reached mathematically, not just by avoiding longer length code words, 
as explained in section 2.3, but also by avoiding larger symbol probabilities. If we 
recall the variance formula: 

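$$V = \sum_{i} P_i (X_i - L_{ave})^2$$

where $P_i$ is the probability of symbol i, $X_i$ is the length of its code word and
$L_{ave}$ is the average code length, the smaller the $P_i$'s are, the smaller the variance is.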


When we apply this explanation in our coding process, the purpose is to leave 
the higher probability symbols out of the calculation. In other words, drop them 
before encoding. 
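
The average length and the variance used in this argument can be computed with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the example alphabet in the usage note is not the one of Figure 3.3. Probabilities of the remaining symbols are used as given, without renormalization after dropping, which is an assumption of this sketch.

```python
def average_length(probs, lengths):
    """L_ave = sum of P_i * X_i over the encoded symbols."""
    return sum(p * n for p, n in zip(probs, lengths))


def code_variance(probs, lengths):
    """V = sum of P_i * (X_i - L_ave)^2 over the encoded symbols."""
    l_ave = average_length(probs, lengths)
    return sum(p * (n - l_ave) ** 2 for p, n in zip(probs, lengths))
```

For a hypothetical three-symbol alphabet with probabilities 0.5, 0.25, 0.25 and code word lengths 1, 2, 2, `average_length` returns 1.5 and `code_variance` returns 0.25.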

The average length and variance after dropping the more frequent
symbols, for the same example given in Figure 3.3, are shown below.

L = 0.7
V = 0.716

When we encode the same sample source alphabet, after dropping the more
frequent symbols, by using Modified Huffman coding for E = 0.2, the results
change.

L = 0.8
V = 0.576


Average lengths and variances for four different encoding processes are
summarized in Table 13. When we examine columns two and four, it can be said that
dropping the more frequent symbols has a greater effect on average length than it has
on variance. Although encoding after dropping the more frequent symbols results in a
large decrease in average length, it gives an increase in variance when compared with the
Modified Huffman coding variance (0.24). But, since the decrease in average length is
very large, we can still apply this idea to our sample messages.

Figure 6.1 Dropping Process.

Figure 6.2 Huffman Coding After Dropping.

Figure 6.3 Final Code Words.

Figure 6.4 Modified Huffman Coding After Dropping for E=0.2.

Figure 6.5 Final Code Words For E=0.2.


B. DROPPING MORE FREQUENT SYMBOLS FROM TURKISH ALPHABET

The idea mentioned in the prior section is applied to the sample articles which 
were given in Appendix D. These four sample articles are rewritten several times. At 
the first step, the second most frequent symbol, I, is dropped. Then, at each step one
more symbol, in decreasing frequency order, is dropped. At the seventh step, a total of 
seven symbols (I, A, E, N, R, U, L) are dropped. 

After this last step, some other symbol combinations, which were chosen among 
the more frequent ones, are dropped. The step numbers and the dropped symbols on 
each step are given in Table 14. 

Examination of the rewritten articles showed that the meaning of the articles
was dramatically destroyed at every step after the first one.

According to the author’s observation, dropping more than one of the vowels 
affected the meaning level significantly. So, for the sake of experiment, these articles 
were rewritten by dropping one vowel symbol and one or more consonant symbols. 
The symbol which was used at the first step, I, is chosen as the vowel symbol. 

First the (I, N) combination, and then the (I, N, R) combination, was
dropped. The desired meaning level was reached with the first set.

After choosing the symbols which can be dropped without destroying the 
meaning of the articles, the Modified Huffman coding process is applied. The resulting 
average lengths and variances for each experimental E value are given in Figure 6.6. 
The articles after dropping I and N are given in Appendix H. 

The Huffman encoding without modification (E = 0.0) and without dropping
any symbols gives an average length of 4.30771 and a variance of 1.9182 (Figure 3.6).
When we compare these values with the Figure 6.6 values, it can easily be seen that
dropping more frequent symbols (I and N) gives us a reduction in both average code
length and variance. Hence, dropping the more frequent symbols is also a solution for
reducing bandwidth and buffer size.

TABLE 14
DROPPED SYMBOLS AT EACH STEP

Figure 6.6 Average Lengths And Variances After Dropping I And N.


C. DROPPING “MORE AND LESS” FREQUENT SYMBOLS TOGETHER

The positive effect of dropping less frequent symbols on the bandwidth and the 
buffer size is shown in Chapters 3 and 5. The same positive effect of dropping more 
frequent symbols is shown in the previous section. Since both approaches yield the 
desired result separately, could they give the same desired result when applied together? 
In other words, if we drop some more frequent symbols and some less frequent 
symbols, what would the result be? 

To answer this question, at each step in addition to the more frequent symbols 
examined in the last section (I and N), some less frequent symbols were dropped and 
the same articles were rewritten. Figure 6.7 gives the step numbers and the 
corresponding symbols. 

The meaning level of the articles is destroyed after the fourth step. The combination of
more and less frequent symbols which can be dropped without destroying the
meaning is (I, N, Q, X, ?, -,:, W, J,;3,(, ), ", 5 ,). The Modified Huffman encoding
process is applied after dropping this combination of more and less frequent symbols.
The resulting average lengths and variances for each experimental E value are given in
Figure 6.8. The rewritten forms of the articles are given in Appendix I.

This process not only gave smaller average lengths and variances than Huffman
coding, but also gave smaller values than dropping more frequent symbols (Figure
6.6). Hence the gain (reduction in the bandwidth and the buffer size) is greater than the
gain of Huffman coding, Modified Huffman coding, and Modified Huffman coding
after dropping more frequent symbols.

It should be mentioned here that the optimal symbol combination which can be
dropped without destroying the meaning level of any message is a subject for more
detailed research. It, of course, changes from alphabet to alphabet and also changes
from field to field. For example, this combination might be different in the military 
than in chemistry or some other field. Also, the frequent usage of the same words and 
phrases in one field might let users of this field drop more symbols for communicating 
within this field. 

Figure 6.8 Average Lengths And Variances After Dropping.


VII. EVALUATION OF THE RESULTS AND CONCLUSIONS


A. EVALUATION OF THE RESULTS 

As stated earlier, the main objective of this research is to reduce the bandwidth, 
in other words the transmission bit rate, and the buffer size. The technique used is to 
drop the less frequent source symbols before encoding and then to encode the message 
by using Modified Huffman Coding. 

The results which were reached in Chapter 5 gave a 33.11% reduction in
bandwidth and a 75.35% reduction in buffer size for the first 200 symbols of the
message given in Appendix A. The bandwidth gain was calculated by comparing the
results of encoding after dropping less frequent symbols with the results of the block
coding process; the buffer size gain is the mean gain of Code D2 over Code D1.

During experiments for finding the optimal modification parameter (E) after the 
dropping process, it was concluded that a change in the number of symbols in the 
source alphabet affected the optimal modification parameter. In other words, a 
modification parameter which is optimal for a given source alphabet is not necessarily 
optimal for the same alphabet after the dropping process. This conclusion leads us to 
calculate the optimal modification parameter individually for each separate source 
alphabet with a given number of symbols. 

In Chapter 6, two additional approaches were briefly examined. The first is to
encode the message after dropping the more frequent symbols by using Modified
Huffman Coding. The second is to encode after dropping a combination of more and
less frequent symbols. In order to show the effect of these last two approaches on the
average code lengths and variances, let’s choose a modification parameter and compare
the resulting values with the Huffman coding results without
modification and dropping.

In Chapter 3, for E = 0.04 the average code length and the variance without
dropping any symbols were calculated to be 4.38381 and 0.89210 (Figure 3.6). After
dropping the less frequent symbols, for the same parameter the results are 4.07948 and
0.048310 (Figure 4.5). Finally, in Chapter 6, after dropping the more frequent
symbols the results are 4.3479 and 1.79454 (Figure 6.6). And, from Figure 6.8, the
average code length and variance after dropping the more and less frequent symbol
combination are 4.26919 and 0.58460. If we recall the Huffman coding results, without
modification and without dropping, from Chapter 3, Figure 3.6, the average length is
4.30731 and the variance is 1.91820. Figure 7.1 shows these values together. We rearrange
Figure 7.1 in decreasing variance order, as shown in Figure 7.2. Figure 7.3 shows the
same code results in decreasing average length order.
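
The two orderings can be reproduced from the values quoted in the paragraph above. The sketch below is only an illustration; the code numbers and short code names are inferred from the way the text refers to "the fourth code" and "codes five and three", and should be read as assumptions rather than as the contents of Figure 7.1.

```python
# (code number, code name, average length, variance) for E = 0.04,
# using the values quoted in the text; numbering is an assumption.
CODES = [
    (1, "Huffman coding", 4.30731, 1.91820),
    (2, "Modified Huffman coding", 4.38381, 0.89210),
    (3, "Modified Huffman after dropping less frequent symbols", 4.07948, 0.048310),
    (4, "Modified Huffman after dropping more frequent symbols", 4.3479, 1.79454),
    (5, "Modified Huffman after dropping more and less frequent symbols", 4.26919, 0.58460),
]


def by_decreasing_variance(codes):
    """Order used for Figure 7.2."""
    return sorted(codes, key=lambda c: c[3], reverse=True)


def by_decreasing_length(codes):
    """Order used for Figure 7.3."""
    return sorted(codes, key=lambda c: c[2], reverse=True)


def improves_on(code, baseline):
    """True if a code has both a smaller average length and a smaller variance."""
    return code[2] < baseline[2] and code[3] < baseline[3]
```

With these values, only codes 3 and 5 have both a smaller average length and a smaller variance than plain Huffman coding (code 1).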

As explained earlier, longer average length means a larger bandwidth is required. 
When Figure 7.3 is examined, it can be seen that the dropping processes decrease the 
average length. Although the fourth code has a larger average length than Huffman 
coding, it is smaller than modified Huffman coding average length. This is the effect of 
the modification process on the average length. The modification process, without 
dropping any symbol, while decreasing the variance, increases the average length. 

At this point, when Figure 7.2 is examined, it is seen that this negative effect of
the modification process (increase in average length, while having a decrease in
variance) is eliminated by using codes five and three. These two, encoding after
dropping more and less frequent symbols and after dropping less frequent symbols by
using Modified Huffman Coding, not only decrease the variance but also decrease the
average length.


Figure 7.1 Results Of Five Different Coding For E=0.04. (Columns: code no., code name, average length, variance.)

Figure 7.2 Results In Decreasing Variance Order.

Figure 7.3 Results In Decreasing Average Length Order.


B. CONCLUSION

We have shown that employing Modified Huffman coding after dropping the less
frequent source symbols, or a combination of more and less frequent source symbols,
results in a decrease in variance as well as in average code length.

A decrease in average length reduces the number of digits transmitted in a unit
time in a communications system. Hence, this communications system can either
handle the same amount of traffic with less transmission bandwidth, and share the
excess capacity with others, or transmit a greater traffic load with the same available
bandwidth. In both cases, the required buffer size, due to a dramatic reduction in the
variance, will be very small. This reduction in buffer size results in cost savings and
reduces the need for complex network flow control algorithms.

In addition to these benefits of the dropping process, the Modified Huffman
coding technique can be used for encryption, since each E value results in a unique set
of code words. The modification parameter E can be considered as an encryption key
and distributed for each encryption period to the stations for decryption. This presents
a subject for future research.

APPENDIX A
THE TURKISH MAGAZINE ARTICLES 


1. “STRANGE SHAPES OF MODERN SHIPS” 

The first article titled “Strange Shapes of Modern Ships” is given below. 

Bir derginin ressami, en guclu vinclerin yapamadigi isi basararak, 50.000 tonluk
bir “olyanus devi”ni sudan cikardi ve boylece, geminin burnundaki yumrubas “balb”
ortaya cikmis oldu. Geminin kic tarafinda da bazi yenilikler goze carpiyordu. Bunlarin 
sirri acaba ne olabilirdi? Otomobil yapimcilarinin yeni gelistirdikleri modelleri 
denedikleri “ruzgar tunelleri”nin bir benzeri deniz tekneleri uzerinde calisan
meslektaslari icin de gecerli oluyor. Onlarin da yeni tekne modelleri denedikleri “test 
havuzlari” var. Yeni gemiler, ancak, bu havuzlarda yapilan deneylerin olumlu sonuclar 
vermesinden sonra, insa edilmek uzere kizaga konuyor. Bu arada, gemi
muhendislerinin isleri, kara araclari uzerinde ugras veren meslektaslarinin islerinden
biraz daha guc. Bu gucluk, daha model asamasinda baslar. Deneyleri yapilan gemi 
modelleri, yeterince buyuk oldugu zaman, deneylerden alinan olcum sonuclari, 
istenileniverebilmektedir. Guclugu yaratan ikinci etken de, dunyamizin “su” ve “hava” 
olarak bilinen iki elamamindan kaynaklanmaktadir. Bir kara tasitinda, laroseri sadece 
ruzgara karsi koymak zorunda olmasina karsin, bir teknenin hem dalgaya ve hem de,
ruzgara karsi koymasi gerekir. Eski tarihlerde insa edilmis gemilerde, burunlar
keskinlestirilir ve boylece suyun daha az bir direnimle yarilmasi saglanirdi. Ancak, bu
is, aslinda hic de gorundugu kadar basit degildir. Gemi hesaplari, sualtindan ateslenen
bir roketin hesaplarindan daha karmasik ve guctur. Biraz once belirtigimiz gibi bir
gemi, su ve hava ortaminda seyreder. Bu nedenle de, ozellikle havanin ve suyun
birlestigi nokta, muhendisler icin bir “bilmece”dir. Deney havuzlarindan alinan sonuclar
okyanuslar icin de gecerli oldugundan; bu benzer iliskilerden yararlanan gemi
muhendisleri, deneylerini deney havuzlarinda yapmaktadirlar. Gemiye hareket veren
pervane, tekneyi ileriye iterken, geminin burnunda bir dalga olusur. Bu dalga, burunda, 
yanlarda, dipte ve kicta gemiyi yalayarak gecer. Ancak, anilan dalga alisilagelen tipte 
bir dalga olmayip, saga-sola karisik hareketler yapan sular halindedir. Genu burnunda 
olusan ve tekne tarafindan iletilen bu su kitleleri, gemi burnunun genisligi oraninda 
artan bir yigilma yaparak, istenilmeyen bir direnc olusturur (sekil 1). Istenilmeyen bu
direncin etkisini azaltabilmek icin, geminin burnunda yumrubas denilen ve mahmuzu
andiran bir cikinti yapilir. Yumrubasin etkisi soyle aciklanabilir: yumrubasli bir tekne,
onunde iki dalga tepesi olusturur. Bunlardan, teknenin olusturdugu dalga tepesi, 
yumrubasin olusturdugu dalganin cukurunun doldurarak, gemi burnundaki yigilmayi 
onler (sekil 2). Donuc olarak da, istenilmeyen dalga yok edilir. Yumrubas adi verilen 
bu yeni burun tipi, Amerikali gemi adami David Taylor’un bulusudur. Yuzyilimizin 
baslarinda Taylor, yumrubasli gemilerin, digerlerine kiyasla daha kucuk dalgalar 
olusturdugunu tespit etmis ve bunun teorisi saha sonra gelistirilmistir. Ancak, tum 
olasiliklari aydinliga kavusturacak kesin formuller gunumuzde dahi tam olarak 
saptanmis degildir. Yumrubas teorisinin gelismesini asagidaki maddelerle acikliyabiliriz: 
(1) seyir halindeki bir gemi, onunde buyuk bir dalga tepesi olusturarak ilerler. (2) su 
yuzeuinin hemen altinda hareket ettirlen bir kure, arkasinda bir dalga cukuru 
olusturur. (3) gemi modelinin burnuna bir kure yerlestirilerek, kurenin olusturdugu 
dalga cukuru ile gemi modelinin olusturdugu dalgayi cakistiracak bir deney uygulamasi 
gerceklestirilir. (4) deneyde, dalga cukurunun dalga tepesini yuttugu gorulur. (5) dalga 
tepesi yutuldugundan; istenilmeyen direnc etkisini kaybeder. Sonuc olarak, gemi 
modeli daha buyuk bir hiz kazanir veya hareketi icin gerekli olan guc azalir. Alinan bu 
sonuc, geminin tukettigi yakitta hic de azimsanmavacak bir tasarruf saglandigini ortaya 
koyar. Armatorlerin yumrubasli gemi siparislerine agirlik vermelerinden sonra, 
muhendislerin isleri daha da guclesmistir. Ilk zamanlarda yumrubaslar, yolcu ve savas
gemilerinde uygulaniyordu. Bununda nedeni, anilan gemilerin seferlerini genellikle sabit
bir su kesiminde yapmalari idi. Oysa, armatorun siparise bagladigi yuk gemilerinde su
kesimi (draft), gemilerin yuklu veya bos olmalarina gore, degisebildigi icin, gemi
burnunda yer alan yumrubas, etkinlik pozisyonunu koruyamamaktadir. Gemi, yukunu
alarak sefere ciktiginda; yumrubas, sualtinda, kalarak, etkinligini surdurmekte ise de, 
yukun bosaltilmasindan sonra, su yuzeyine cikmakta ve sonuc olarak, etkinligini 
kaybetmektedir. Bu durum, yumrubasin gemi burnunda nerede ver almasi gerektigi 
sorununu ortaya cikarmistir. Daha sonra, yumrubas, gemi burnunun biraz daha 
asagisina alinarak, suyun altinda birakilmis ve istenilen sonuca kismen de olsa 
ulasilmistir. Yumrubasi sadece  sualtinda  birakmakla  sorunlara  cozum 
getirilememektedir. Cunku, her tekne kendine ozgu bir dalga sekli olusturmakta be bu 
nedenle de, yumrubasin, kullanacagi tekne ile uvum saglayacak ozelliklere sahip olniasi 
gerekmektedir. Gemi muhendislerinin goguslemek zorunda olduklari bu guclukler, 
yenbi arastirma alanlarinin dogmasina yol acmis ve bu kez de, arastirmalar geminin kic 


tarafinda yogunlasmistir. Yaklasik 20 yil kadar once, Hamburglu gemi muhendisi ernst 
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nonnecke, yeni bir kic formu gelistirmis ise de, onun bu bulusu ancak son yillarda deger 
kazanmaga ve dikkat cekmege baslamistir. Nitekim, nonnecke’nin bulusu, bir kore 
tersanesinde 2 konteyner gemisinde uygulamaya konulmustur. Teorik calismalar 
Hamburg’da baslamis ve bunu izleyen deneylerde, insa edilecek geminin bir modeli, 
boyu 300 M. Ve derinligi 18 M. olan bir deney havuzuna cekilerek, nonnecke’nin 
gelistirdigi kic formunun ustunlugu kabul edilmistir. Bu tip asimetrik kic formu: 
sancak tarafi cukur ve iskele tarafi disa dogru bombelidir. Bu formun ozelligi, suyun 
akisini duzelterek, dogrudan pervaneye vermesidir. Nonnecke tipi kic teorisi su sekilde 
aciklanabilir: sivi icinde hareket eden bir govde, suyu bas taraftan yarar. Yarilan su, 
govdenin kic tarafinda yine birlesmek egilimi gosterirken, bu kez de geminin pervanesi 
ile karsilar. Geminin hareket yonune gore, saga dogru donen pervane, suyu teKnenin 
sancak (sag) tarafindan asagiya iter, buna karsin, iskele tarafindan (sol), yukariya dogru 
itilerek, teknenin kic tarafinda birleseme egilimi gosteren su, birlesmeden pervanenin 
akimina kapilir. Cekilen sualti fotograflari ile tespit edilen bu olay, suyun gemide iskele 
tarafindan gerektirdigi itici gucu olusturmadan, yukartya dogru itildigi gercegini ortaya 
koymustur. Bu olay uzerinde duran nonnecke, iskele tarafindan pervaneye yonelen su 
akisini duzenleyebilmek icin gemide sancak be iskele taraflarinin pervaneye yakin olan 
kisimlarinda, tasarladigi form degisikliklerini gerceklestirmistir. Buna gore, geminin 
sancak tarafi cukurlastirilmis; iskele tarafinda ise, cukurlugun yerini yumusak bir 
bombe almistir (sekil 5). Sonuc olarak, suyun dagilmaksizin ve turbulansa
ugramaksizin, pervaneye akabilmesi saglanmistir (sekil 3 ve 5) eski ve yeni tip iki
geminin en kesit egrilerini vermektedir. Eski tip bir gemide en kesit egrileri simetrik bir 
bicim gostermekte ve geminin ortasinda duz bir cizgi boyunca birlesmektedir (sekil 3). 
Diger tip kic formunda ise, anilan egriler a simetrik olarak gelmekte ve genunin 
ortasinda “S” sekilindeki bi cizgi uzerinde toplanmaktadir (sekil 5). Sekil 4 ve 6’da, eski 
ve yeni tip kic formlarinin birer profili ile pervaneye dogru vyonelen suvun akisi 
gorulmektedir. Eski tip kic formunda (sekil 4); pervaneye dogru akis yapan su, pervane 
ile Karsilastiginda turbulansa ugramakta ve dolayli olarak da, gemi dieselinin pervaneve 
aktardigi gutce kayba vol acmaktadir. Nonnecke tipi kic formunda ise, pervaneve 
yonelen suyun akisi duzenlenmis (sekil 6) ve duzenlenen su, turbulansa ugramadan, 
pervane tarafindan itilerek, pervanenin verimi artirilmis ve geminin daha az bir gucle 
daha buyuk bir hiz kazanmasi saglanmistir. “Thea S” adli 124 metrelik gemude yapilan 
deneyler, bu yeni kic formunun gunde 2.000 litrelik bir yakit tasarrufu sagladigini
ortaya koymustur. Eski tip gemi formlarinin gecerli oldugu gunlere kiyasla, yakit
fiatlarinin bugun 10 kat arttigi goz onunde tutulursa, gemilere saglanlan yakit
tasarrufunun ne kadar onemli oldugu ve modern gemilerinin nicin boyle garip 
bicimlerde insa edildigi sorusu kendiliginden aydinliga kavusabilir. 


2. “STORY OF THE SPACE SHUTTLE”

The second magazine article is titled “Story of the Space Shuttle” and is given 
below. 

1970’lere dek dayanan uzay mekigi projesinin temel amaci, uzaya daha ucuz ve 
dolayisiyla daha sik gitmektir. Mekikten once uzaya atilan insanli ve insansiz uydular, 
sonda ve roketler sadece bir kez kullanilabiliyordu ve bu nedenle maliyetleri yuksek 
oluyordu. Uzay mekigi projesi ile insanoglu, ayni uzay aracini surekli kullanma
olanagina kavustu. Bu projenin en belirgin ozelligi ucak teknolojisi ile uzay
teknolojisini bir araya getirmesidir. Sistem genelde uc ana bolumden olusmaktadir: (1) 
yorunge araci da denen uzay gemisinin kendisi; (2) buyuk dis yakit tanki; (3) dis yakit 
tankinin her iki tarafinda bulunan kati yakitli roketler. Sistemi firlatma aninda, 
geminin arkasinda bulunan ana motorlar ve iki firlatici roket ateslenir. Bu islemin 
sonunda, otuz milyon newton'luk cok buyuk bir firlatma kuvveti, sistemi havalandirir. 
Havalandiktan bir dakika sonra sistemin surati, ses suratini asar. Bu sirada geminin 
icinde olsaniz ve kendinizi tartsaniz, yeryuzunde 60 kilo gelen vucudunuzun, iki dakika
icinde sismanlamis olmamasina karsin, 180 kilo geldigini gorursunuz. Bu ilginc
durum, aracin ivmesinin, cekim ivmesinden uc kat fazla olmasindan
kaynaklanmaktadir. Havalandiktan sonra kati yakitli roketlerin yakitlari biter ve dis 
yakit tankindan ayrilirlar. Bu anda gemi, 50 km. Yukseklikte ve hizi Saatte 5.000 
km’ye ulasmistir. Ayrilan roketler, ilk hizlarindan dolayi derhal asagiya dusmezler. 50 
km‘de ayrilan bu roketler, 67 km’ye dek cikar ve sonra dusmeye baslar. Duserken, 
yuzeyden yaklasik 3 km. Yukseklikten, uc evreli parasut sistemi calisir ve dususun 
hizini azaltir. Denize dusen roketler, su yuzeyine degdikleri anda parasutlerden ayrilir 
ve alt tarafta bulunan ozel bolmeler siserek, roketlerin batmamalari saglanir. Daha 
sonra bunlar denizden toplanir. Gerekli onarim ve bakim yapilarak, bir sonraki ucus 
icin hazirlanirlar. Bu kati yakitli roketlerin kalkistaki agirligi, yaklasik 580 tondur ve 
11.800.000 newton‘luk bir itme meydana getirmektedir. Uzunlugu 45.5 metre, silindirik 
govdenin capi ise 3.7 metredir. Uzay gemisinin ana motorlarina yakit veren buyuk dis 
tank ise yerden 200 km. yukseklikte iken yakiti bittiginde aractan ayrilir. 20 katli bir
apartman yuksekliginde (50 m.) olan bu buyuk silindirik tankin capi 30 metredir.
Yapimi icin 30 ton aluminyum kullanilan bu tankin bir kez kullanilmasi, bir cok kisinin
NASA’yi elestirmesine neden olmaktadir. Cunku mekikten ayrilan tank, daha sonra 
dunya atmosferine girerek yanmaktadir. NASA muhendisleri bu tanklardan nasil 
yararlanacaklarini dusunmektedirler. Hazirlanan bu projeye gore, 1990’dan sonra 
kurulmasi beklenen uzay istasyonunun, bu tanklardan yirmisinin bir araya getirilerek 
yapilmasi onerilmektedir. Martin Marietta Aerospace sirketi’nin gelistirilmis
programlar baskani olan Frank Williams’a gore gemi, tankini uzayda biraz daha sonra 
birakacak. O zaman tank, yer atmosferine dusmeyecek, gemiyi izleyerek istenen 
yorungeye oturtulmasi saglanacak. Deneylerin yapilacagi ve incinde rahatca 
yasanilabilecek saglamlikta olan bu silindirler uc uca eklendiginde, istenen uzay 
istasyonunun hem daha kisa zamanda, hem de daha ekonomik bir sekiled yapilabilecegi 
ileri surulur. Uzay gemisinin on govdesi ve murettebat bolumu, aluminyumdan 
yapilmis uc kattan olusmaktadir. En ust katta, yorunge aracinin dendisini, tum uzay 
gemisi sistemini ve tasinan yuku yoneten, deneteleyen kumanda sistemi yer almaktadir. 
Bu katta, uc astronot iskemlesi bulunmaktadir. Orta kat, ucus zamani tasima ve yasam 
bolumu olarak ayrilmistir. Ayrica bu bolum, geminin yuk tastyan dargo bolumu ile 
baglantilidir. Alt katta ise cevre kontrol gerecleri yer almaktadir. Geminin orta 
bolumu, yuk tasiyan kargo bulumudur ve uzaya giderken ustten acilan iki kapak ile 
ortulmektedir. Uzayda bu kapaklar acilarak, uydulari yorungeye oturtmak, vuruvus 
vapmak gibi cesitli gorevler yerine getirilmektedir. Arka govde ve motor yuvalarini 
tasivan son bolum, yorunge aracinin en karmasik parcasidir. Sadece 8 dakika surevle 
ateslenen ve yorungeve erismezden once 6 milyon newton luk firlatma kuvveti varatan 
uc ana motor bu bolumdedir. Ana motorlar sustuktan sonra gemiyi yorungesine 
oturtan iki roketten olusan vorunge manevra sistemi de bu arka bolumdedir. Son 
Olarak bu bolumde 38’I ana, 6’si duvarli olmak uzere toplam 44 kucuk roketten 
olusmus, tepki - denetim sistemi vulunmaktadir. Bu sistem, aracin (yorunge icinde 
kalma kosulu ise) konumu ve uc ekseni bovunca donme hareketleri saglamaktadir. 
Yukarida kisaca ozelliklerini tanitmaya calistig inuz uzay gemisi ilk uzay ucusunu, 3 
yillik gecikmeden sonra, 1981 yilinda yapti. Ucusa hazirlanan 4 uzay genusinden ilk 
vapilani, Colombia adini tasivordu. Ucus komutani ve pilot, ilk gem sevrinin 
personelivdiler. 12 nisan 1981 Colombia Florida’daki firlatma ussunden havalandi. 
Dunya ceversinde 36 tur atan genu Kalkistan 54.5 saat sonra, 14 nisan gunu yeryuzune 
dondu. Ucus basarili gecmisti ama; gemiyi yuksek sicaktan koruvan favanslari onemli 


derecede hasara ugramisti. Hasar nedem olan sicakhk, ozellikle arac dunya’ya 
donerken, atmosferdeki surtunmeden kaynaklaniyordu. Ikinci ucus, 14 kasim 1981 
gunu gerceklestirildi. Bes gun olarak dusunulen ucus programi yarida desildi ve gemi 
iki gun sonra yeryuzu’ne dondu. Bu ucusunda hava kirliligi, deniz arastirmalari gibi bir 
takim bilimsel arastirmalar yapildi. Ayrica, kanadalilarin yaptigi herhangi bir yone 
dogru 15.6 metre uzanabilen, gemi disindaki bir nesneyi tutmak icin veya icindeki bir 
aleti tutup uzaya birakabilmek icin kullanabilecek, kiminin vinc, kiminin robot, 
bazilarinin da mekanik kol dedigi birimi denediler. Bu ucusta gemi, birinciye gore daha 
az hasara ugramisti. Ucuncu ucus, 22 mart 1982 gunu basladi ve ilk kez sekiz gun 
surdu. Gemi, planlanan seyrini bir gun gecikmeyle 30 mart’ta tamamladi. Bu seyirde, 
komutan ve pilot, normal calismalarin yani sira, bir cok seyle de ugrastilar. Bunlar 
uzay tutmasi, radyo arizalari, tikanmis tuvalet, lumbuzlardaki kiragi, arizali radar 
ekrani ve uykusuzluktu. Fakat herseye karsin, cok basarili bir seyirdi. Astronotlar, 
geminin sadece bir yuzunu daima gunes’e cevirerek birkac saat isittilar, dogal olarak 
diger taraf da dondu. Boylece geminin isisal ozellikleri saptanmis oldu. Mekanik kola 
yerlestirilen bir cihazla, uzay gemisi cevresindeki parcaciklar ve elektrik alanlari olculdu. 
Mekanik kolun hareketini surekli denetim altinda tutmak icin kol uzerine yerlestirilen 
televizyon kamerasi arizalanica, personel ayni isi yapabilmek icin bildigimiz avci 
durbunu kullanmak zorunda kaldilar. [lk ucus gununun sonunda, yeryuzu’nden 
havalanirken lumbuz koruyucusunu kiran beyaz maddenin, geminin bas kismindan 
kopan isi koruyucu oldugunu kesfettiler. Personel ilk gun hicbir sey yiyvemedi. Avyrica 
pilot, agirliksiz ortama alisamadigindan uvuyamadi; dolavisiyla da ikinci gun cok 
yorgun dusmustu. Bu durumu pilot su sozlerle dile getirtyordu: “kendimi, sanki her on 
dakikada bir maraton kosuyormus gibi hissettim.” Bu sevirde ayrica ari, pervane, ve 
sineklerden olusan hayvanlarin, agirliksiz ortamda davranislari incelendi. Arilar 
ucmaktan yorulduklarinda, amacsiz bir sekilde olduklari yere donuvyorlardi. Gemi 
dunya ya dondugunde tum arilar olmustu. Pervaneler cilgin bir sekilde kanat cirptilar; 
sinekler hep yuruduler. Pilot ucmak icin calisan bir sinegi asla gormedigini soyluvordu. 
Inisin yapilacagi Edwards hava kuvvetleri ussu’ndeki kuru gol yatagi mevsimin de 
etkisiyle inis gunu iyice islanmisti. Bu nedenle, inis orava degil de, New Mexico’daki 
limana yapildi. Fakat inisin yapilacagi gun kuvvetli bir firtina patlamis ve inisin 
vapilacagi alan, seyirdeki gemiden dahi rahatca gorulebilinen bevaz bir toz bulutu 
altinda kalmisti. Bu nedenle ucus bir gun geciktirildi. Dorduncu ucus, 27 haziran - 4 
temmuz 1982 arasi gerceklestirildi. Bu seyir digerlerinden iki vonden farklhivd1. 


Birincisi, askeri amacli yuk tasiyordu. Hava kuvvetleri yukun ne oldugunu aciklamadi. 
Fakat bu gizli yukun, kirmiziotesi arama ve tarama yapan bir alet oldugu biliniyordu. 
Ikinci1 farkli yon, ogrencilerin hazirladigi 90 kg. Agiriligindaki deney paketinin 
tasinmasiydi. Bu seyirde yapilan bir baska deney de bazi biyolojik materyalin 
birbirlerinden ayrilmastydi. Deneyi yapan alet, bu materyal karisimi bir elektrik alana 
koyuyor ve onlari dogal elektrik yuklerine gore secebiltyordu. Dunya ustunde bu 
islemi, yercekimi etkilemekte elektrik yuku, kicaklik ve calkantiya neden olmakta, 
dolayisiyla da miateryal tekrar birbirine karismaktadir. Uzayda bu materyalleri 
birbirinden ayirmanin, 800 kez daha etkin oldugu ortaya cikarildi. Bu son deneme 
ucusuydu. Bundan sonraki ucuslar, normal ticari amacli olacakti. Dorduncu ucusta 
basariya ulasamayan en onemli nokta, Kati yakith roketlerin parasut mekanizmasinin 
arizalanmasi ve her biri 7 milyar tl’na mal olan bu roketlerin deniz dibini boylamasiydi. 
Besinci ucusun personel sayisi, ilk kez ikiden fazla oluyordu. Ucus komutani ve 
pilottan baska, William ve Joseph adli iki astronot da ucus uzmani olarak gemide yer 
aldilar. Gemunin ilk ticari yuku olan iletisim uydulari 11 kasim 1982 gunu baslayan bu 
seferde basariyla yorungeye oturtuidu. Eger bu uydular yerden yorungete 
yerlestirilseydi, uydu sahipleri daha fazla para odemek zorunda kalacaklardi. Bu 
seyirde personeli uzay tuttu. Bu yuzden uzayda yuruvus izlencesi bir gun ertelendi. 
Ertesi gun ise her biri varim milvar tl’na mal olan uzay melbusati arizalandi. Tum 
ugraslara karsin arizalar giderilemedigi icin yuruyusten vazgecildi. Fakat bu cok onemli 
bir deneydi; cunki gelecekte uzay limani gibi buyuk vyapilar insa edilirken, bu techizat 


ile arac disi calismalar vapilacak. 
APPENDIX B 
THE LISP PROGRAM 


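(defun huffman (P)
  (sortcar (assign (arrange (mapcar 'list P))) 'greaterp))

; Repeatedly merge the two lightest nodes until a single tree remains.
(defun arrange (Q)
  (cond ((null (cdr Q)) Q)
        (t (arrange (insert (list (add (caar Q) (caadr Q))
                                  (car Q) (cadr Q))
                            (cddr Q))))))  ; remainder of the list; this argument is missing from the listing

; Insert node x into the weight-ordered list Q.
; N and epsilon are the modification parameters.
(defun insert (x Q)
  (cond ((null Q) (cons x Q))
        ((lessp (plus (car x) epsilon) (caar Q)) (putin N x Q))
        (t (cons (car Q) (insert x (cdr Q))))))

; Place x after the first n elements of L, or at the end if L is shorter.
(defun putin (n x L)
  (cond ((zerop n) (cons x L))
        ((null L) (list x))
        (t (cons (car L) (putin (sub1 n) x (cdr L))))))

(defun assign (Q) (split nil (car Q)))

; Walk a node (weight left right) and return one (weight code) pair per leaf.
(defun split (c L)
  (cond ((null (cdr L)) (list (list (car L) c)))
        (t (append (split (cons 1 c) (cadr L))
                   (split (cons 0 c) (caddr L))))))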


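; Reassign the code words so that shorter codes go to heavier symbols.
(defun sortcode (L)
  (cond ((null L) nil)
        (t (inscode (caar L) (cadar L) (sortcode (cdr L))))))

(defun inscode (p c L)
  (cond ((null L) (list (list p c)))
        ((greaterp (length c) (length (cadar L)))
         (cons (list p (cadar L)) (inscode (caar L) c (cdr L))))
        (t (cons (list p c) L))))  ; default clause; it is missing from the listing

; Sum of weight times code length over all symbols.
(defun totlength (L)
  (cond ((null L) 0)
        (t (add (times (caar L) (length (cadar L)))
                (totlength (cdr L))))))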


; Weighted mean code length.
(defun avglength (L)
  (quotient (times 1.0 (totlength L))
            (apply 'add (mapcar 'car L))))

; Weighted variance of the code length.
(defun varlength (L)
  (quotient (times 1.0 (varlength2 L (avglength L)))
            (apply 'add (mapcar 'car L))))

(defun varlength2 (L mu)
  (cond ((null L) 0)
        (t (add (times (caar L)
                       (expt (difference (length (cadar L)) mu) 2))
                (varlength2 (cdr L) mu)))))

(defun Zipf (n)
  (cond ((zerop n) nil)
        (t (cons (quotient 1.0 n) (Zipf (- n 1))))))


; Encode the Turkish symbol counts with parameters N = n and epsilon = e,
; then print the code, its mean length and its variance.
(defun tryN (n e)
  (set 'N n)
  (set 'epsilon e)
  (set 'code (sortcode (huffman Turkish)))
  (print (list 'N '= n 'epsilon '= e))
  (pp code)
  (print (list 'mean '= (avglength code))) (terpr)
  (print (list 'variance '= (varlength code))) (terpr))

(set 'Prob '(20 25 33 50))

(set 'Turkish
  '(0 6 6 17 28 34 39 45 45 56
    61 67 67 73 73 84 84 89 112 134
    162 196 358 581 687 872 989 1017 1224 1637
    1883 2185 2660 2682 2945 3213 3509 3861 3984 5130
    5163 6085 6611 7952 9427 10528 13339))

(set 'N 0)
(set 'epsilon 0)
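The mean and variance computed by avglength and varlength can be cross-checked outside Lisp. The following is a minimal Python sketch, not the thesis program: it builds an ordinary (unmodified) binary Huffman code with a heap, so the parameters N and epsilon are not modeled, and the function names are illustrative assumptions.

```python
import heapq


def huffman_lengths(weights):
    """Code-word lengths of a standard binary Huffman code (no N/epsilon modification)."""
    if len(weights) == 1:
        return [1]
    # Heap items are (weight, unique tiebreak, indices of the symbols in the subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    tiebreak = len(weights)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node gains one code bit.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths


def mean_and_variance(weights, lengths):
    """Weighted mean and variance of the code length, as in avglength/varlength."""
    total = sum(weights)
    mean = sum(w * l for w, l in zip(weights, lengths)) / total
    var = sum(w * (l - mean) ** 2 for w, l in zip(weights, lengths)) / total
    return mean, var


if __name__ == "__main__":
    counts = [1, 1, 2, 4]
    lens = huffman_lengths(counts)
    print(lens, mean_and_variance(counts, lens))
```

Because the weights are raw counts rather than probabilities, both statistics are divided by the total count, which is what the Lisp functions do with (apply 'add (mapcar 'car L)).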
APPENDIX C 
PASCAL COMPUTER PROGRAM WHICH DROPS THE SYMBOLS 


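PROGRAM Something (INPUT, OUTPUT);
VAR
  ch : CHAR;
  x  : INTEGER;    { number of dropped symbols }

BEGIN
  x := 0;
  WHILE NOT EOF DO
  BEGIN
    READ (ch);
    { the quoted text below stands for the symbol chosen to be dropped }
    IF (ch = 'any symbol to be dropped') THEN
      x := x + 1
    ELSE
      WRITE (ch)
  END;
  WRITELN;
  WRITELN (' The Number of the Dropped Symbols is ', x)
END.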
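The dropping step can also be expressed in a few lines of Python. This is a minimal sketch of the same idea, assuming the text is held in a string and the dropped symbols are given as a set; the function name is illustrative and does not come from the thesis.

```python
def drop_symbols(text, dropped):
    """Remove every character in `dropped` from `text` and count the removals."""
    kept = []
    count = 0
    for ch in text:
        if ch in dropped:
            count += 1          # symbol is dropped, nothing is written
        else:
            kept.append(ch)     # symbol is passed through unchanged
    return "".join(kept), count


if __name__ == "__main__":
    print(drop_symbols("kadin", {"i", "n"}))
```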
APPENDIX D 
SHORT TURKISH ARTICLES 


1. THE FIRST ARTICLE 

Genel Bilgiler : 1. Yabancilar ve yurt disinda calisan Turkler girislerinde beyan 
etmek kosuluyla 3,000Amerikan dolari veya esitini asan dovileri beraberlerinde yurt 
disina cikarabilirler. 2. Yolcular encok 1,000 Amerikan dolari karsiligi Turk parasini 
yurt disina cikarabilirler. 3. Yolcular kendilerine ait 3,00 Amerikan dolarini asmayan 
ziynet esyalarini giriste beyan edilmek sarti ile yurt disina goturebilirler. 4. Yolcular 
degerine bakilmaksizin, gumruk mevzuatina uygun, sahsi, ailevi, mesleki ve trustik 
nitelikteki esyalari beraberlerinde goturebilirler. [Ref. 9] 


2. THE SECOND ARTICLE 

Kisisel Esya : 1. Yolcunun giyinip kusanmasina, kullanmasina, suslenmesine ait ( 
ic camasirlair, gomlek, kravat, elbise, palto, manto, sapka, ayakkabi, toka dugme, kupe, 
bilezik, yuzuk, birer adet cep ve kol saati, medil, corap, pijama, perdesu, semsiye gibi 
esya ile yurt disinada iki yil veya daha fazla kalip Turkiye’vye kesin donen kisinin bir 
adet kurkten mamul giyim esyasi). 2. Yolcunun okumasina ve yazmasina ait esya 
{kitap, dergi, kursun kalem, kagit, defter, kristal, gumus veya kivmetli madenlerden 


olanalr haric yazi takimi gibi ). 3. Bir adet portatif yazi makinasi. [Ref. 9] 


3. THE THIRD ARTICLE 

Tip bilimi, kadin ile erkek arasinda bir ustunluk sorunu degil, sadece bir 
“farklilik” oldugunu soyluyor. 18 yasindaki bir kadin (yada bir kiz), ayni vastaki bir 
erkekten ortalama 10 santimetre daha kisa ve gene ortalama 13 kilo daha hafif. 
Erkeklerde adele yapisi vucud agirliginin yvuzde kirk kadarini olustururken, bu oran 
kadinlarda yuzde 23 dolavinda kaliyor. Bir baska acidan bakilirsa, kadin vucudunda 
erkege nazaran yuzde 10 kadar fazla yag dokusu var. Ve, hepsi bu. Bu farkliliklar, 
Kadinin dayaniklilik gerektiren sporlarda erkeklere gore daha avantajli olmalarini 
sagliyor. Buna karsilik erkeklerde adele gucu isteyen sporlarda daha basarili.Maratonda 
erkek ile kadin arasindaki der ece farklarinin 2,000 yili civarinda ortadan kalkacagi 


saniliyor. Mansi gecerken en iyi 10 derece yapmis olan yuzuculerden 8'i Kadin. 
[Ref. 10] 
4. THE FOURTH ARTICLE 

Maraton parkuru tam 42,195 metre tutuyor. Kosunun asagi yukari 10. 
kilometresinde, vucud, sinirlerinden erdorfin denilen bir madde salgilayarak kan 
dolasimina katmaya basliyor. Endorfin, bunyenin olusturdugu dogal bir uysturucu. 
Gorevide, vucud biolojik bir olum kalim savasi verirken aci duygusunu onleyip 
mucadelenin surmesini saglamak. Tip biliminin elde ettigi bulgulara gore, maraton 
kosucularinin yuzde 3 kadari, bir sure sonra bu maddenin kesin tutkunu haline 
geliyorlar. Kosmadiklari taktirde, tipki eroin muptelalari gibi, uykulari kaciyor, 
saldirgan bir tavir aliyorlar, hatta bunalim, korku gibi surekli ruh bozukluklari 
gosterenlere bile rastlaniyor. Digerlerinde de, aci hissi ortadan kalktigindan dolayi, 
"kendini cok iyi hissetme” hali gozleniyor. [Ref. 11] 
APPENDIX E 
REWRITTEN ARTICLES AT STEP 6 


1. THE FIRST ARTICLE 

Genel bilgiler : 1 Yabancilar e yurt disinda calisan Turkler girislerinde beyan 
etmek kouluyla 3000 Amerikan dolari eya esitini asmayan doileri beraberlerinde yurt 
disina cikarabilirler 2 Yolcular encok 1000 Amerikan dolari karsiligi Turk arasini yurt 
disina cikarabilirler 3 Yolcular kendilerine ait 3000 Amerikan dolarini asmayan iynet 
esyalarini beraberlerinde yurda getirebilirler veya yurt disina cikarabilirler Kiymeti 3000 
Amerikan dolarindan yukari olan iynet esyalari giriste beyan edilmek sarti ile yurt 
disina goturulebilir 4 Yolcular sasi, ailei, mesleki e turistik nitelikteki esyalari 
beraberlerinde goturebilirler 


2. THE SECOND ARTICLE 

Kisisel esya : 1 Yolcunun giyini kusanmasina, suslenmesine ait esya ic 
camasirlari, gomlek, kraat, elbise, alto, manto, saka, ayyakkabi, toka dugme, kue, 
bileik, yuuk, birer adet ce e kol saati, mendil, cora, 1ama, erdusu, semsiye gibi esya ile 
yurt disinda iki yil eya daa ala kali Turkiye’ye kesin donen kisinin bir adet kurkten 
mamul giyimesyasi1 2 Yolcunun okumasina e yamasina ait esya kita dergi, kursun e 
murekkeli kalem, kagit, deter, kristal, gumus e kiymetli madenlerden olanalar aric yai 


takimi gibi 3 bir ade ortati yai makinasi 


3. THE THIRD ARTICLE 

Ti bilinu, kadin ile erkek arasinda bir ustunluk sorunu degil, sadece bir arklilik 
oldugunu sovluyor 18 yasindaki bir kadin yada ki, ayni yastaki bir erkekte ortalama 10 
santimetre daa kisa e gene ortalama 13 kilo daa ai Erkeklerde adele vaisi ucud 
agirligini yude 40 kadarini olustururken, bu oran kadinlarda yude 23 dolayinda kaliyor 
Baska bir acidan bakilirsa, kadin ucudunda erkege nazaran yude 10 kadar ala vag 
dukusu ar e, esi bu Bu arkliliklar, kadinin dayaniklilik gerektiren sorlarda erkeklere gore 
daa aantali olmalarini sagliyor Buna karsilik erkeklerde adele gucu isteyen sorlarda daa 
basarili Maratonda erkek ile kadin arasindaki derece arklarinin 2000 yili ciarinda 


ortadan kalkacagi saniliyor Mansi gecen en ivi 10 derece yamis olan yuuculerden 8 1 
kadin 
4. THE FOURTH ARTICLE 

Maraton arkuru tam 42195 metre tutuyor Kosunun asgi yukari 10 
kilometresinde, ucud, sinirlerinden endorin denilen bir madde salgilayarak kan 
dolasimina katmaya basliyor Endorin,bunyenin olusturdugu dogal bir uyusturucu 
Goreide, ucud biolojik bir olum kalim saasi erirken aci duygusunu onleyi mucadelenin 
surmesini saglamak Ti bilimin elde ettigi bulgulara gore, maraton kosucularinin yude 3 
kadar, bir sure sonra bu maddenin kesin tutkunu aline geliyorlar Kosmadiklari 
taktirde, tiki eoin mutelelar gibi, uykulari kaciyor, saldirgan bir tair aliyorlar atta 
bunalim, korku gibi surekli ru boukluklari gosterenlere bile rastlaniyor Digerlerinde de 
aci issi ortadan kalktigindan dolayi, kendini cok 1yi issetme ali golentyor 

The Number of the Dropped Symbols is : 140 
APPENDIX F 
SIMULATION PROGRAM WITHOUT DROPPING PROCESS 


PROGRAM Buffersize (INPUT, OUTPUT);
VAR
  ch : CHAR;
  X  : INTEGER;
  MAXBUF, BUFI, INPUTR, OUTR, LE, BUF : REAL;
BEGIN
  LE := 0.0;
  BUF := 0.0;
  MAXBUF := 0.0;
  BUFI := 1.0;
  X := 0;
  INPUTR := 6.0;
  OUTR := 4.01359;
  WHILE NOT EOF DO
  BEGIN
    READ (ch);
    X := X + 1;
    { LE receives the code-word length of ch. The lengths themselves are     }
    { absent from the listing. Literals marked (?) are illegible there and   }
    { are inferred from the order of the symbols dropped in Appendix G.      }
    IF (ch = 'l') THEN           { as printed; possibly 'I' or a blank }
      LE :=
    ELSE IF (ch = 'A') THEN
      LE :=
    ELSE IF (ch = 'E') THEN
      LE :=
    ELSE IF (ch = 'N') THEN
      LE :=
    ELSE IF (ch = 'R') THEN
      LE :=
    ELSE IF (ch = 'U') THEN
      LE :=
    ELSE IF (ch = 'L') THEN
      LE :=
    ELSE IF (ch = 'S') THEN
      LE :=
    ELSE IF (ch = 'K') THEN
      LE :=
    ELSE IF (ch = 'D') THEN
      LE :=
    ELSE IF (ch = 'T') THEN
      LE :=
    ELSE IF (ch = 'M') THEN
      LE :=
    ELSE IF (ch = 'Y') THEN
      LE :=
    ELSE IF (ch = 'O') THEN
      LE :=
    ELSE IF (ch = 'G') THEN
      LE :=
    ELSE IF (ch = 'B') THEN
      LE :=
    ELSE IF (ch = 'C') THEN      { (?) }
      LE :=
    ELSE IF (ch = ',') THEN
      LE :=
    ELSE IF (ch = '.') THEN
      LE :=
    ELSE IF (ch = 'Z') THEN
      LE :=
    ELSE IF (ch = 'V') THEN      { (?) }
      LE :=
    ELSE IF (ch = 'P') THEN      { (?) }
      LE :=
    ELSE IF (ch = 'H') THEN      { (?) }
      LE :=
    ELSE IF (ch = 'F') THEN      { (?) }
      LE :=
    ELSE IF (ch = '0') THEN
      LE :=
    ELSE IF (ch = ) THEN         { illegible; a quotation-mark character }
      LE :=
    ELSE IF (ch = '1') THEN
      LE :=
    ELSE IF (ch = '"') THEN
      LE :=
    ELSE IF (ch = '2') THEN
      LE :=
    ELSE IF (ch = ')') THEN
      LE :=
    ELSE IF (ch = '5') THEN
      LE :=
    ELSE IF (ch = '3') THEN
      LE :=
    ELSE IF (ch = '8') THEN
      LE :=
    ELSE IF (ch = '(') THEN
      LE :=
    ELSE IF (ch = '4') THEN
      LE :=
    ELSE IF (ch = '''') THEN     { (?) apostrophe }
      LE :=
    ELSE IF (ch = '9') THEN
      LE :=
    ELSE IF (ch = 'J') THEN
      LE :=
    ELSE IF (ch = '6') THEN
      LE :=
    ELSE IF (ch = 'W') THEN
      LE :=
    ELSE IF (ch = ':') THEN
      LE :=
    ELSE IF (ch = '7') THEN
      LE :=
    ELSE IF (ch = '-') THEN
      LE :=
    ELSE IF (ch = '?') THEN      { (?) }
      LE :=
    ELSE IF (ch = 'X') THEN
      LE :=
    ELSE IF (ch = 'Q') THEN
      LE := ;
    { add the new code word, then remove what the channel sends per symbol time }
    BUF := BUF + LE;
    BUF := BUF - OUTR;
    IF (BUF < 0.0) THEN
      BUF := 0.0;
    BUFI := BUF;
    { track the largest buffer occupancy seen so far }
    IF (MAXBUF < BUFI) THEN
      MAXBUF := BUFI
    ELSE
      MAXBUF := MAXBUF
  END;

Lo Ney 

  WRITELN (' BUFFER', BUF);
  WRITELN (' REQUIRED BUFFER SIZE FOR', X, ' CHARACTERS IS', MAXBUF);
  WRITELN (' OUTPUT RATE IS', OUTR);
  WRITELN (' INPUT RATE IS ', INPUTR)
END.

$ENTRY

( MESSAGE )




APPENDIX G 
SIMULATION PROGRAM WITH DROPPING PROCESS 


PROGRAM Buffersize (INPUT, OUTPUT);
VAR
  ch : CHAR;
  Z, X, Y : INTEGER;
  MAXBUF, BUFI, INPUTR, OUTR, LE, BUF : REAL;
BEGIN
  LE := 0.0;
  BUF := 0.0;
  MAXBUF := 0.0;
  BUFI := 1.0;
  X := 0;
  Z := 0;
  Y := 0;
  INPUTR := 6.0;
  OUTR := 4.01359;
  WHILE NOT EOF DO
  BEGIN
    READ (ch);
    X := X + 1;                  { total number of characters read }
    { literals marked (?) are illegible in the listing and are inferred }
    IF (ch = 'Q') OR (ch = 'X') OR (ch = '?') OR (ch = '-') OR (ch = ':') OR
       (ch = 'J') OR (ch = '''') { (?) apostrophe } OR (ch = '(') OR (ch = ')') OR
       (ch = '"') OR (ch = ) { illegible } OR (ch = 'F') OR (ch = 'H') OR
       (ch = 'P') OR (ch = 'V') OR (ch = 'Z') OR (ch = '.') OR (ch = ',') OR
       (ch = 'W') THEN
      Y := Y + 1                 { dropped symbol: count it, send nothing to the buffer }
    ELSE
    BEGIN
      { LE receives the code-word length of ch; the lengths are absent from the listing }
      IF (ch = '1') THEN         { as printed; possibly a blank }
        LE :=
      ELSE IF (ch = 'A') THEN
        LE :=
      ELSE IF (ch = 'l') THEN    { as printed; possibly 'I' }
        LE :=
      ELSE IF (ch = 'E') THEN
        LE :=
      ELSE IF (ch = 'N') THEN
        LE :=
      ELSE IF (ch = 'R') THEN
        LE :=
      ELSE IF (ch = 'U') THEN
        LE :=
      ELSE IF (ch = 'L') THEN
        LE :=
      ELSE IF (ch = 'S') THEN
        LE :=
      ELSE IF (ch = 'K') THEN
        LE :=
      ELSE IF (ch = 'D') THEN
        LE :=
      ELSE IF (ch = 'T') THEN
        LE :=
      ELSE IF (ch = 'M') THEN
        LE :=
      ELSE IF (ch = 'Y') THEN    { (?) }
        LE :=
      ELSE IF (ch = 'O') THEN    { (?) }
        LE :=
      ELSE IF (ch = 'G') THEN
        LE :=
      ELSE IF (ch = 'B') THEN
        LE :=
      ELSE IF (ch = '0') THEN
        LE :=
      ELSE IF (ch = '1') THEN    { (?) }
        LE :=
      ELSE IF (ch = '2') THEN
        LE :=
      ELSE IF (ch = '3') THEN
        LE :=
      ELSE IF (ch = '5') THEN
        LE :=
      ELSE IF (ch = '8') THEN
        LE :=
      ELSE IF (ch = '4') THEN
        LE :=
      ELSE IF (ch = '9') THEN
        LE :=
      ELSE IF (ch = '6') THEN
        LE :=
      ELSE IF (ch = '7') THEN
        LE := ;
      { add the new code word, then remove what the channel sends per symbol time }
      BUF := BUF + LE;
      BUF := BUF - OUTR;
      IF (BUF < 0.0) THEN
        BUF := 0.0;
      BUFI := BUF;
      { track the largest buffer occupancy seen so far }
      IF (MAXBUF < BUFI) THEN
        MAXBUF := BUFI
      ELSE
        MAXBUF := MAXBUF
    END
  END;

  Z := X - Y;                    { characters actually encoded }
  WRITELN (' BUFFER', BUF);
  WRITELN (' REQUIRED BUFFER SIZE FOR', Z, ' CHARACTERS IS', MAXBUF);
  WRITELN (' TOTAL NUMBER OF CHARACTERS IS', X);
  WRITELN (' NUMBER OF DROPPED SYMBOLS IS', Y);
  WRITELN (' OUTPUT RATE IS', OUTR);
  WRITELN (' INPUT RATE IS ', INPUTR)
END.

$ENTRY

( MESSAGE )
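The buffer bookkeeping in the two simulation programs reduces to a short loop. The Python sketch below restates it under stated assumptions: the code-length table is a hypothetical placeholder (the listings above do not show the lengths), every character in the text that is not dropped must appear in that table, and the function and parameter names are illustrative.

```python
def simulate_buffer(text, code_length, out_rate, dropped=frozenset()):
    """Return (largest buffer occupancy in bits, number of dropped symbols)."""
    buf = 0.0
    max_buf = 0.0
    n_dropped = 0
    for ch in text:
        if ch in dropped:
            n_dropped += 1                   # dropped symbols never reach the buffer
            continue
        buf += code_length[ch]               # the code word enters the buffer
        buf = max(buf - out_rate, 0.0)       # the channel drains out_rate bits, never below empty
        max_buf = max(max_buf, buf)          # remember the peak occupancy
    return max_buf, n_dropped


if __name__ == "__main__":
    lengths = {"A": 1, "B": 6}               # hypothetical code lengths, for illustration only
    print(simulate_buffer("ABBA", lengths, out_rate=4.0))
    print(simulate_buffer("ABBA", lengths, out_rate=4.0, dropped={"B"}))
```

Passing an empty `dropped` set corresponds to the program without the dropping process; passing the set of dropped symbols corresponds to the program with it.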
APPENDIX H 
REWRITTEN ARTICLES AFTER DROPPING I AND N 


1. THE FIRST ARTICLE 

geel blgler : 1. Yabaclar ve yurt dsda calsa Turkler grslerde beya etmek kouluyla 
3000 Amerka dolar vaya est asmayan dovler beraberlerde yurt dsa ckarablrler. 2. 
Yolcular ecok 1000 Amerka dolar karslg Turk paras yurt dsa ckarablrler. 3. Yolcular 
kedlere at 3000 Amerka dolarda yukar ola zyet esyalar grste beya edlmek sart le yurt 
dsa goturuleblr. 4. Yolcular, sahs, alev, meslek ve turustk telktek esyalar beraberlerde 


beraberlerde gotureblrler. 


2. THE SECOND ARTICLE 

Kssel Esya : 1. Yolcuu gyp kusamasa, suslemese at esya ( c camasrlar gomlek, 
kravat, elbse, palto, mato, sapka, ayyakkab, toka, dugme, kupe, blezk, yuzuk, brer adet 
cep ve Kol saat, medl, corap, pjama, perdesu, semsye gb esya le yurt dsda k yl veya 
daha fazla kalp Turkye’ye kes doe ks br adet kurkte mamul gym esyas ) 2. Yolcuu 
okumsasa ve yazmasa at esya (ktap, derg, kursu ve murekkepl kalem, kagt, defter, 
krstal, gumus ve kymetl maddelerde olalar harc yaz takm gb) 3. Br adet portatf yaz 


makas. 


3. THE THIRD ARTICLE 
Tp blm, kad le erkek arasda br ustuluk soruu degl, sadece br “farkllk” olduguu 
sovluyor. 18 yasdak br kad (yada kz), ay vastak br erkekte ortalama 10 satimetere daha 
ksa ve gene ortalama 13 klo daha haff Erkeklerde adele yaps vucud agrlg yuzde krk 
kadar olustururke, bu ora kadlarda 
yuzde 23 dolavda kalyor. Baska br acdan baklrsa, kad vucududa erkege nazara vuzde 
10 fazla yag dokusu var. Ve, heps bu. Bu farkllklar, kad dayakllk gerektre sporlarda 
erkeklere gore daha avatajl olmalar saglyor. Bua karslk erkeklerde adle gucu steve 
sporlarda daha basarl. Maratoda erke le kad arasdak derece farklar 2000 yl cvarda 


ortada kalkacag salyor. Mas gece e y 10 derece yapms ola yuzuculerde 8 i kad. 


4. THE FOURTH ARTICLE 
Marato parkuru tam 42,195 metre tutuyor. Kosuu asg yukar 10. klometresde, 


vucud slerde edorf dele br madde salglayarak ka dolasma katmaya baslyor. Edof, buye 
olusturdugu dogalbr uysturucu. Gorede, vucud bolojk br olum kalm savas verrke ac 
duygusuu oleyp mucadele surmes saglamak. Tp blm elde ettg bulgulara gore, marato 
kosucular yuzde 3 kadar, br sure sora bu madde kes tutkuu hale gelyorlar. Kosmadklar 
taktrd tpk ero muptelalar gb, uykular kacyor, saldrga br tavr alyorlar. Hatta bualm, 
korku gb surekl ruh bozukluklar gosterelere ble rastlayor. Digerlerde de, ac hss ortada 
kalktgda dolay “kend cok y hssetme” hal gozleyor. 

The Number of the Dropped Symbols is : 448 
APPENDIX I 
REWRITTEN ARTICLES AFTER DROPPING THE SYMBOLS 
COMBINATION 

1. THE FIRST ARTICLE 

geel blgler 1 Yabaclar ve yurt dsda calsa Turkler grslerde beya etmek kouluyla 
3000 Amerka dolar vaya est asmayan dovler beraberlerde yurt dsa ckarablrler 2 
Yolcular ecok 1000 Amerka dolar karslg Turk paras yurt dsa ckarablrler 3 Yolcular 
kedlere at 3000 Amerka dolarda yukar ola zyet esyalar grste beya edlmek sart le yurt 
dsa goturuleblr 4 Yolcular sahs alev meslek ve turustk telktek esyalar beraberlerde 


beraberlerde gotureblrler 


2. THE SECOND ARTICLE 
Kssel Esya 1 Yolcuu gyp kusamasa suslemese at esya c camasrlar 
gomlek kravat elbse palto mato sapka ayyakkab toka dugme kupe blezk yuzuk brer 
adet cep ve kol saat medl corap pjama perdesu semsye gb esya le yurt dsda k yl veya 
daha fazla kalp Turkyeye kes doe ks br adet kurkte mamul gym esyas 2 Yolcuu 
okumsasa ve yazmasa at esya ktap derg kursu ve murekkKepl kalem kagt defter krstal 


gumus ve kymetl maddelerde olalar harc yaz takm gb 3 Br adet portatf yaz makas 


3. THE THIRD ARTICLE 

Tp blm kad le erkek arasda br ustuluk soruu degl sadece br farkllk olduguu 
soyluvor 18 yasdak br kad yada kz ay yastak br erkekte ortalama 10 satimetere daha 
Ksa ve gene ortalama 13 klo daha haff Erkeklerde adele yaps vucud agrlg yuzde krk 
kadar olustururke bu ora kad a yuzde 23 dolayda kalvor Baska br acdan baklrsa kad 
vucududa erkege nazara yuzde 10 fazla yag dokusu var Ve heps bu Bu farkllklar kad 
davakllk gerektre sporlarda erkeklere gore daha avatajl olmalar saglyor Bua karslk 
erkeklerde adle gucu steye sporlarda daha basarl Maratoda erke le kad arasdak derece 
farklar 2000 yl cvarda ortada kalkacag salyor Mas gece e y 10 derece yapms ola 


yuzuculerde 8 1 kad 


4. THE FOURTH ARTICLE 
Marato parkuru tam 42195 metre tutuyor Kosuu asg yukar 10 klometresde vucud 
Slerde edorf dele br madde salglavarak ka dolasma katmaya baslvor Edof buve 


Olusturdugu dogal br uysturucu Gorede vucud bolojk br olum kalm savas verrke ac 
duygusuu oleyp mucadele surmes saglamak Tp blm elde ettg bulgulara gore marato 
kosucular yuzde 3 kadar br sure sora bu madde kes tutkuu hale gelyorlar Kosmadklar 
taktrd 
tpk ero muptelalar gb uykular kacyor saldrga br tavr alyorlar Hatta bualm korku gb 
surekl ruh bozukluklar gosterelere ble rastlayor Digerlerde de ac hss ortada kalktgda 
dolay kend cok y hssetme hal gozleyor 
The Number of the Dropped Symbols is : 544 